Knight-coderr / Employee_Salary_Prediction

	Dataset:
	Use the Data Science Salaries 2023 dataset, available on Kaggle.
	Tasks and Requirements:
	1. Data Exploration and Preprocessing:
	- Load the dataset and perform exploratory data analysis (EDA).
	- Clean the data, handle missing values, and encode categorical variables.
	- Split the data into training and testing sets.
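
A minimal sketch of the preprocessing steps above, assuming pandas and scikit-learn. The column names (`salary_in_usd`, `experience_level`, and so on) are assumptions about the Kaggle CSV, and `preprocess` is a hypothetical helper name; adjust both to match the actual file.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Assumed column names; check them against the downloaded CSV.
TARGET = "salary_in_usd"
CATEGORICAL = ["experience_level", "employment_type", "company_size"]
NUMERIC = ["work_year", "remote_ratio"]


def preprocess(df: pd.DataFrame):
    """Clean, encode, and split; returns X_train, X_test, y_train, y_test."""
    # Drop exact duplicates and rows without a target value.
    df = df.drop_duplicates().dropna(subset=[TARGET]).copy()

    # Fill missing numeric values with the median, categoricals with a marker.
    for col in NUMERIC:
        df[col] = df[col].fillna(df[col].median())
    for col in CATEGORICAL:
        df[col] = df[col].fillna("unknown")

    # One-hot encode the categorical variables.
    X = pd.get_dummies(df[NUMERIC + CATEGORICAL], columns=CATEGORICAL)
    y = df[TARGET]

    # Hold out 20% of the rows for testing.
    return train_test_split(X, y, test_size=0.2, random_state=42)
```

With the CSV downloaded locally (the filename `ds_salaries.csv` is assumed), `df = pd.read_csv("ds_salaries.csv")` followed by `df.info()` and `df.describe()` gives a first look for EDA before calling `preprocess(df)`.
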
	2. Model Training:
	- Train multiple machine learning models (e.g., Linear Regression, Decision Trees, Random Forest, Gradient Boosting).
	- Use MLflow to track experiments, including parameters, metrics, and artifacts.
	- Evaluate the models using appropriate metrics (e.g., RMSE, MAE, R²).
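
The training and tracking steps above could be sketched as follows, assuming scikit-learn regressors and one MLflow run per model. The experiment name `salary-prediction` and the helper names are hypothetical, and the hyperparameters are library defaults rather than recommended values.

```python
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.tree import DecisionTreeRegressor


def candidate_models():
    """The four model families named in the task, with default settings."""
    return {
        "linear_regression": LinearRegression(),
        "decision_tree": DecisionTreeRegressor(random_state=42),
        "random_forest": RandomForestRegressor(n_estimators=100, random_state=42),
        "gradient_boosting": GradientBoostingRegressor(random_state=42),
    }


def evaluate(model, X_test, y_test):
    """Return RMSE, MAE, and R² for a fitted model."""
    pred = model.predict(X_test)
    return {
        "rmse": float(mean_squared_error(y_test, pred) ** 0.5),
        "mae": float(mean_absolute_error(y_test, pred)),
        "r2": float(r2_score(y_test, pred)),
    }


def train_and_track(X_train, X_test, y_train, y_test, experiment="salary-prediction"):
    """Fit every candidate and log parameters, metrics, and the model to MLflow."""
    # Imported here so the helpers above stay usable without MLflow installed.
    import mlflow
    import mlflow.sklearn

    mlflow.set_experiment(experiment)  # assumed experiment name
    results = {}
    for name, model in candidate_models().items():
        with mlflow.start_run(run_name=name):
            model.fit(X_train, y_train)
            metrics = evaluate(model, X_test, y_test)
            mlflow.log_params(model.get_params())
            mlflow.log_metrics(metrics)
            mlflow.sklearn.log_model(model, "model")  # stored as a run artifact
            results[name] = metrics
    return results
```

By default MLflow writes runs to a local `mlruns` directory, which `mlflow ui` can display for side-by-side comparison.
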
	3. Model Selection and Optimization:
	- Compare the performance of different models.
	- Optimize the best-performing model using hyperparameter tuning.
	- Record all experiments and their results using MLflow.
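
One way to approach the tuning step is a cross-validated grid search, sketched below. It assumes, purely for illustration, that Random Forest turned out to be the best-performing model; the parameter grid, the run name, and the helper names `tune` and `log_search` are hypothetical.

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Illustrative grid only; widen or narrow it based on the comparison results.
PARAM_GRID = {"n_estimators": [50, 100], "max_depth": [None, 10]}


def tune(X_train, y_train, param_grid=PARAM_GRID, cv=3):
    """Grid-search a Random Forest, selecting by cross-validated RMSE."""
    search = GridSearchCV(
        RandomForestRegressor(random_state=42),
        param_grid=param_grid,
        scoring="neg_root_mean_squared_error",  # higher (closer to 0) is better
        cv=cv,
    )
    search.fit(X_train, y_train)
    return search


def log_search(search, run_name="random_forest_tuned"):
    """Record the winning configuration and model in MLflow."""
    # Imported here so tune() stays usable without MLflow installed.
    import mlflow
    import mlflow.sklearn

    with mlflow.start_run(run_name=run_name):
        mlflow.log_params(search.best_params_)
        # best_score_ is negated RMSE, so flip the sign before logging.
        mlflow.log_metric("cv_rmse", -search.best_score_)
        mlflow.sklearn.log_model(search.best_estimator_, "model")
```
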
	4. Streamlit Application:
	- Create a Streamlit app to interact with the trained model.
	- The app should allow users to input features and get salary predictions.
	- Display relevant model performance metrics and visualizations in the app.
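
A minimal sketch of such an app is shown here. The file names (`model.joblib`, `feature_columns.joblib`, `metrics.json`), the feature names, and the category codes in the select boxes are all assumptions; they need to match whatever the training step actually saved.

```python
import json

import pandas as pd

# Assumed categorical features; must match the columns used in training.
CATEGORICAL = ["experience_level", "employment_type", "company_size"]


def encode_input(raw: dict, feature_columns: list) -> pd.DataFrame:
    """One-hot encode a single input row and align it with the training columns."""
    row = pd.get_dummies(pd.DataFrame([raw]), columns=CATEGORICAL)
    # Columns unseen in this row are filled with 0; extra columns are dropped.
    return row.reindex(columns=feature_columns, fill_value=0)


def main():
    import joblib
    import streamlit as st

    # Hypothetical artifacts written by the training script.
    model = joblib.load("model.joblib")
    feature_columns = joblib.load("feature_columns.joblib")
    with open("metrics.json") as fh:
        metrics = json.load(fh)

    st.title("Data Science Salary Prediction")
    raw = {
        "work_year": st.number_input("Work year", min_value=2020, max_value=2023, value=2023),
        "remote_ratio": st.selectbox("Remote ratio (%)", [0, 50, 100]),
        "experience_level": st.selectbox("Experience level", ["EN", "MI", "SE", "EX"]),
        "employment_type": st.selectbox("Employment type", ["FT", "PT", "CT", "FL"]),
        "company_size": st.selectbox("Company size", ["S", "M", "L"]),
    }
    if st.button("Predict salary"):
        prediction = model.predict(encode_input(raw, feature_columns))[0]
        st.metric("Predicted salary (USD)", f"{prediction:,.0f}")

    st.subheader("Model performance")
    st.json(metrics)
    # Tree-based models expose feature importances, which make a simple chart.
    if hasattr(model, "feature_importances_"):
        st.bar_chart(pd.Series(model.feature_importances_, index=feature_columns))
```

Saved as `app.py` with a final `main()` call added at the bottom, the app would be started with `streamlit run app.py`.
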
	5. Model Registration and Deployment:
	- Register the best model in the MLflow Model Registry.
	- Deploy the model using Hugging Face Spaces.
	- Ensure the deployed model is accessible via an API for inference.
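
A minimal sketch of the registration step, assuming the best model was logged under the artifact path `model` and that the tracking backend in use supports the Model Registry. The registry name `salary-predictor` and the helper names are hypothetical.

```python
def runs_uri(run_id: str, artifact_path: str = "model") -> str:
    """Build the URI MLflow uses to address a model logged in a run."""
    return f"runs:/{run_id}/{artifact_path}"


def register_best(run_id: str, name: str = "salary-predictor"):
    """Register the model from the given run under an assumed registry name."""
    import mlflow

    return mlflow.register_model(model_uri=runs_uri(run_id), name=name)


def load_registered(name: str = "salary-predictor", version: int = 1):
    """Load a registered version as a generic pyfunc model for inference."""
    import mlflow.pyfunc

    return mlflow.pyfunc.load_model(f"models:/{name}/{version}")
```

The `run_id` is the ID of the best run as shown in the MLflow UI. A deployed app could call `load_registered` at startup, provided it can reach the tracking server; how the inference API is exposed depends on the Space type chosen and is not shown here.
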