Machine Learning-Based Analysis of Gene Expression Profiles in Breast Cancer

Original Educational Template – Provided by course instructors
Mohamed Hussein – Code restructuring, expanding functionality, data analysis enhancement, and GitHub publication
Date: 2025-09-02

Notebook 0: Load Required Libraries

This notebook initializes the Python environment by loading all libraries required for the full pipeline, including data handling, preprocessing, feature selection, modeling, evaluation, and visualization.
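
As a rough preview of how these stages connect, the sketch below chains scaling, feature selection, a classifier, and an accuracy check. It is only an illustration under stated assumptions: the data come from `make_classification` (a synthetic stand-in, not the breast cancer expression data), and the values `k=10`, `n_estimators=100`, and `test_size=0.25` are arbitrary choices rather than settings taken from this project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for a samples-by-genes matrix (assumption: not real data).
X, y = make_classification(
    n_samples=120, n_features=50, n_informative=8, random_state=0
)

# Hold out a stratified test set before fitting any transformer.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Preprocessing: fit the scaler on the training split only, then apply to both.
scaler = MinMaxScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Feature selection: keep the 10 features with the highest mutual information.
selector = SelectKBest(score_func=mutual_info_classif, k=10).fit(X_train_s, y_train)
X_train_k = selector.transform(X_train_s)
X_test_k = selector.transform(X_test_s)

# Modeling and evaluation on the held-out split.
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train_k, y_train)
accuracy = accuracy_score(y_test, model.predict(X_test_k))
print(f"Held-out accuracy on synthetic data: {accuracy:.2f}")
```

Fitting the scaler and selector on the training split alone keeps information from the test samples out of the model, which matters when the number of features far exceeds the number of samples.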

0.1 Python Libraries

0.1.1 Data Handling

In [6]:
import pandas as pd   # tabular data structures (DataFrame, Series)
import numpy as np    # numerical arrays and vectorized operations

0.1.2 Data Visualization

In [8]:
import matplotlib.pyplot as plt
import seaborn as sns

0.1.3 Preprocessing

In [10]:
from sklearn.preprocessing import LabelEncoder   # encode class labels as integers
from sklearn.preprocessing import MinMaxScaler   # rescale each feature to [0, 1]

0.1.4 Feature Selection

In [12]:
from sklearn.feature_selection import SelectKBest, mutual_info_classif

0.1.5 Classification Models

In [14]:
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

0.1.6 Performance Metrics

In [16]:
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score

0.1.7 Model Saving / Loading

In [18]:
import pickle
import joblib
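
Both modules can serialize a fitted scikit-learn estimator. The sketch below shows one possible round trip with each, assuming a small `SVC` trained on synthetic data; the file name `svc_demo.joblib` is hypothetical and is written to a temporary directory so nothing is left on disk.

```python
import pickle
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Small synthetic problem (assumption: stands in for the real training data).
X, y = make_classification(n_samples=60, n_features=10, random_state=0)
model = SVC(kernel="linear").fit(X, y)

# pickle: serialize to bytes in memory and restore.
restored_pickle = pickle.loads(pickle.dumps(model))

# joblib: write to a file and read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "svc_demo.joblib"  # hypothetical file name
    joblib.dump(model, model_path)
    restored_joblib = joblib.load(model_path)

# Both restored models should reproduce the original predictions.
same_pickle = np.array_equal(model.predict(X), restored_pickle.predict(X))
same_joblib = np.array_equal(model.predict(X), restored_joblib.predict(X))
print("pickle round trip matches:", same_pickle)
print("joblib round trip matches:", same_joblib)
```

Either format should only be loaded from a trusted source, and ideally with the same scikit-learn version that produced it, since unpickling can execute arbitrary code and the on-disk layout is not guaranteed to be stable across versions.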

0.1.8 Train/Test Splitting & Cross-Validation

In [20]:
from sklearn.model_selection import train_test_split, StratifiedKFold, GridSearchCV
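
For illustration, `StratifiedKFold` and `GridSearchCV` might be combined as follows. The parameter grid, the number of folds, and the synthetic data are all assumptions chosen to keep the example small; they are not the settings used later in this pipeline.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.svm import SVC

# Synthetic binary classification data (assumption: illustrative only).
X, y = make_classification(n_samples=100, n_features=20, random_state=0)

# Stratified folds preserve the class ratio in every train/validation split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Hypothetical grid: 3 values of C times 2 kernels gives 6 candidates.
param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}

search = GridSearchCV(SVC(), param_grid, cv=cv, scoring="accuracy")
search.fit(X, y)

print("Best parameters:", search.best_params_)
print("Mean cross-validated accuracy:", round(search.best_score_, 3))
```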

0.1.9 Dimensionality Reduction

In [22]:
from sklearn.decomposition import PCA
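
A minimal sketch of how `PCA` could project a wide matrix onto two components for plotting. The random matrix below is an assumption standing in for an expression matrix with many more features than samples.

```python
import numpy as np
from sklearn.decomposition import PCA

# Random stand-in for a samples-by-genes matrix: 40 samples, 200 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 200))

# Project onto the two directions of largest variance.
pca = PCA(n_components=2)
components = pca.fit_transform(X)

print("Projected shape:", components.shape)
print("Explained variance ratio:", pca.explained_variance_ratio_)
```

The explained variance ratio indicates how much of the total variance each component retains, which helps judge whether a two-dimensional plot is a fair summary of the data.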

0.2 Confirmation

In [24]:
import matplotlib
import sklearn

# pd, np, and sns were imported in the cells above
print("pandas version:", pd.__version__)
print("numpy version:", np.__version__)
print("scikit-learn version:", sklearn.__version__)
print("matplotlib version:", matplotlib.__version__)
print("seaborn version:", sns.__version__)
pandas version: 2.2.2
numpy version: 1.26.4
scikit-learn version: 1.4.2
matplotlib version: 3.8.4
seaborn version: 0.13.2
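
An alternative way to confirm the environment, sketched here with the standard-library `importlib.metadata`, reads installed versions without importing each package and reports missing ones instead of raising. The helper name `report_versions` and the package list are assumptions made for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def report_versions(packages):
    """Map each distribution name to its installed version or 'not installed'."""
    report = {}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = "not installed"
    return report


# Distribution names as published on PyPI (note "scikit-learn", not "sklearn").
pipeline_packages = ["pandas", "numpy", "scikit-learn", "matplotlib", "seaborn", "joblib"]

for name, installed in report_versions(pipeline_packages).items():
    print(f"{name}: {installed}")
```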

Environment Information (Python & Jupyter version)

In [26]:
import platform
import notebook

# Python version 
python_version = platform.python_version()

# Jupyter Notebook version
jupyter_version = notebook.__version__

print("Python version:", python_version)
print("Jupyter Notebook version:", jupyter_version)
Python version: 3.12.3
Jupyter Notebook version: 7.0.8

Next Step: Notebook 1 will focus on Data Exploration & Cleaning.

Export Notebook 0 to HTML

In [29]:
!jupyter nbconvert --to html "Notebook_0_Load_Libraries.ipynb" --output "Notebook_0_Load_Libraries.html"
[NbConvertApp] Converting notebook Notebook_0_Load_Libraries.ipynb to html
[NbConvertApp] Writing 292670 bytes to Notebook_0_Load_Libraries.html