+---
+library_name: sklearn
+tags:
+- energy-consumption
+- regression
+- random-forest
+- xgboost
+- building-energy
+- sustainability
+- carbon-footprint
+pipeline_tag: tabular-regression
+---
+# Ecologia Electricity Consumption Model
+## Model Description
+This model predicts **electricity_consumption (kWh)** for buildings using tree-based ensemble regressors (Random Forest and XGBoost).
+- **Model Architecture**: Random Forest Regressor (Ensemble)
+- **Task**: Regression (Energy Consumption Prediction)
+- **Target Variable**: electricity_consumption (kWh)
+- **Input Features**: 22 features
+- **Training Dataset**: Building Data Genome Project 2
+- **Training Samples**: ~15 million
+## Model Performance
+### Random Forest Model
+- **RMSE**: 37.6519
+- **MAE**: 17.5059
+- **R² Score**: 0.9587
+### XGBoost Model
+- **RMSE**: 59.3440
+- **MAE**: 29.7273
+- **R² Score**: 0.8973
+### Best Model
+The best-performing model, selected by validation RMSE, is saved as `electricity_model.joblib`. Of the two candidates reported above, the Random Forest has the lower RMSE.
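For reference, the three reported metrics can be computed as in the minimal sketch below. It uses only the standard library and toy numbers; the function name `regression_metrics` is hypothetical and this is not the project's evaluation code.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, and R² for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "r2": 1.0 - ss_res / ss_tot,               # undefined if y_true is constant
    }

# Toy values in kWh, not taken from the model's evaluation data
metrics = regression_metrics([100.0, 150.0, 200.0], [110.0, 140.0, 205.0])
print(metrics)
```

With scikit-learn installed, `mean_squared_error`, `mean_absolute_error`, and `r2_score` from `sklearn.metrics` compute the same quantities.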
+## Training Details
+### Dataset
+- **Source**: [Building Data Genome Project 2](https://www.kaggle.com/datasets/claytonmiller/buildingdatagenomeproject2)
+- **Training Samples**: ~15 million
+- **Data Preprocessing**:
+  - Outlier removal (99th percentile)
+  - Feature engineering (temporal, building, weather features)
+  - Missing value imputation
+  - Normalization
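The preprocessing steps above could look roughly like the sketch below. The card does not specify the exact methods, so the choices here are assumptions: outliers are values above the 99th percentile, imputation uses the median, and normalization is a z-score. All function names are hypothetical, and the standard library is used to keep the example dependency-free.

```python
import statistics

def remove_upper_outliers(values, percentile=99):
    """Drop readings above the given percentile; None (missing) entries are kept."""
    observed = [v for v in values if v is not None]
    # quantiles(n=100) returns 99 cut points; index percentile - 1 is that percentile
    threshold = statistics.quantiles(observed, n=100, method="inclusive")[percentile - 1]
    return [v for v in values if v is None or v <= threshold]

def impute_median(values):
    """Replace None with the median of the observed values (assumed strategy)."""
    median = statistics.median(v for v in values if v is not None)
    return [median if v is None else v for v in values]

def standardize(values):
    """Z-score normalization (assumed; the card only says 'Normalization')."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

# Toy kWh readings with one missing value and one extreme spike
readings = [10.0, 12.0, None, 11.0, 13.0, 12.5, 11.5, 10.5, 12.0, 5000.0]
cleaned = remove_upper_outliers(readings)
imputed = impute_median(cleaned)
scaled = standardize(imputed)
print(len(readings), len(cleaned), imputed[2])
```

In a real pipeline the threshold, median, mean, and standard deviation would normally be computed on the training split only and then reused for validation, test, and inference data.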
+### Training Method
+- **Algorithms**: Random Forest and XGBoost (both tree ensembles), trained as separate candidates
+- **Best Model Selection**: Based on validation RMSE
+- **Validation**: Hold-out train/validation/test split (60/20/20)
+- **Hyperparameters**: Optimized for large-scale datasets
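A minimal sketch of the 60/20/20 split and of selection by lowest RMSE follows. The card does not say whether the split is random or chronological, so the shuffle here is an assumption, and `split_60_20_20` and `select_best` are hypothetical helper names. The RMSE values are the ones reported in this card.

```python
import random

def split_60_20_20(rows, seed=0):
    """Shuffle rows and return (train, validation, test) in 60/20/20 proportions."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)   # assumed: random rather than time-ordered split
    n_train = len(shuffled) * 60 // 100
    n_val = len(shuffled) * 20 // 100
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )

def select_best(rmse_by_model):
    """Return the name of the candidate with the lowest RMSE."""
    return min(rmse_by_model, key=rmse_by_model.get)

train, val, test = split_60_20_20(range(100))
best = select_best({"random_forest": 37.6519, "xgboost": 59.3440})
print(len(train), len(val), len(test), best)
```

Because the feature set includes lag features, a chronological split may be the safer choice in practice, since shuffling can place closely related readings in both the training and evaluation sets.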
+### Feature Engineering
+The model uses 22 engineered features including:
+- **Building Features**: Type, area, age, location
+- **Temporal Features**: Hour, day, month, season, day of week
+- **Weather Features**: Temperature, humidity, dew point
+- **Interaction Features**: Building-weather interactions
+- **Lag Features**: Previous consumption patterns
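The temporal and lag features listed above might be derived along these lines. This is an illustrative sketch, not the project's feature code: the season mapping (meteorological, Northern Hemisphere), the Monday=0 weekday encoding, and the helper names are all assumptions.

```python
from datetime import datetime

def temporal_features(ts):
    """Derive calendar features from a timestamp."""
    return {
        "hour": ts.hour,
        "day": ts.day,
        "month": ts.month,
        # Assumed mapping: 0=winter (Dec-Feb), 1=spring, 2=summer, 3=autumn
        "season": (ts.month % 12) // 3,
        "day_of_week": ts.weekday(),  # Monday=0 ... Sunday=6
    }

def lag_feature(series, lag):
    """Shift a series forward by `lag` steps; early entries without history are None."""
    if lag < 1:
        raise ValueError("lag must be a positive integer")
    shifted = [None] * lag + list(series)
    return shifted[:len(series)]

features = temporal_features(datetime(2024, 6, 3, 14))
previous_reading = lag_feature([5.0, 6.0, 7.0, 8.0], lag=1)
print(features, previous_reading)
```

A lag feature should only use readings that are available before the time being predicted, otherwise the target leaks into the inputs.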
+## Usage
+### Installation
+```bash
+pip install scikit-learn xgboost joblib huggingface_hub
+```
+### Load Model
+```python
+from huggingface_hub import hf_hub_download
+import joblib
+REPO_ID = "codealchemist01/ecologia-electricity-model"
+# Download the model and the list of feature names.
+# token=None is enough for a public repo; pass a real access token
+# (never a placeholder string) only if the repo is private or gated.
+model_path = hf_hub_download(
+    repo_id=REPO_ID,
+    filename="electricity_model.joblib",
+    token=None,
+)
+features_path = hf_hub_download(
+    repo_id=REPO_ID,
+    filename="electricity_features.joblib",
+    token=None,
+)
+# joblib files are pickles and can run arbitrary code on load,
+# so only load files from a source you trust.
+model = joblib.load(model_path)
+feature_columns = joblib.load(features_path)
+```
+### Prediction Example
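+```python
+import pandas as pd
+# Prepare input data (example values; column names must match feature_columns,
+# and categorical values such as building_type must be encoded as in training)
+input_data = pd.DataFrame({
+    'building_type': ['Office'],
+    'area_sqm': [1000],
+    'year_built': [2020],
+    'temperature': [20.5],
+    'humidity': [65],
+    'hour': [14],
+    'day_of_week': [1],
+    'month': [6],
+    # ... other required features
+})
+# Report features that were not supplied. Filling them with 0 keeps the
+# example runnable, but real predictions need every feature populated.
+missing = [col for col in feature_columns if col not in input_data.columns]
+if missing:
+    print(f"Warning: filling {len(missing)} missing features with 0: {missing}")
+# Select features in the training order, adding missing ones as 0
+input_data = input_data.reindex(columns=feature_columns, fill_value=0)
+# Make prediction
+prediction = model.predict(input_data)
+print(f"Predicted electricity_consumption (kWh): {prediction[0]:.2f}")
+```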
+## Model Limitations
+- Model performance may vary based on building characteristics and regional differences
+- Training data is primarily from North American buildings
+- Predictions are estimates and should be validated with actual consumption data
+- Model requires all input features to be provided
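Since the model requires every input feature, a caller may prefer to fail fast on incomplete input rather than fill gaps with placeholder values. The sketch below shows one way to do that; `check_features` and the three-feature schema are hypothetical and stand in for the real list loaded from `electricity_features.joblib`.

```python
def check_features(row, required_columns):
    """Return the row's values in model order, or raise if any feature is missing."""
    missing = [name for name in required_columns if name not in row]
    if missing:
        raise ValueError(f"Missing required features: {missing}")
    return [row[name] for name in required_columns]

# Hypothetical three-feature schema for illustration only
required = ["temperature", "humidity", "hour"]
ordered = check_features({"hour": 14, "temperature": 20.5, "humidity": 65}, required)
print(ordered)
```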
+## Ethical Considerations
+- Model is designed to help reduce energy consumption and carbon footprint
+- No personal or sensitive data is used in training
+- Model predictions should be used responsibly for sustainability purposes
+## Citation
+If you use this model, please cite:
+```bibtex
+@software{ecologia_energy_model,
+  title = {Ecologia Electricity Consumption Model},
+  author = {Ecologia Energy Team},
+  year = {2024},
+  url = {https://huggingface.co/codealchemist01/ecologia-electricity-model},
+  note = {Trained on Building Data Genome Project 2 dataset}
+}
+```
+## License
+This model is released under the MIT License.
+## Contact
+For questions or issues, please open an issue on the repository or contact the Ecologia Energy team.
+## Acknowledgments
+- Building Data Genome Project 2 dataset creators
+- scikit-learn and XGBoost communities
+- Hugging Face for model hosting
+---
+*This model is part of the Ecologia sustainability platform for energy consumption prediction and carbon footprint calculation.*