---
title: README
emoji: π
colorFrom: indigo
colorTo: yellow
sdk: static
pinned: false
---
# OSS-Forge

**OSS-Forge** is an open research initiative focused on *trustworthy, secure, and transparent AI-assisted software engineering*.

We develop and publish:

- **static and dynamic analyzers** for AI-generated code
- **benchmarks and datasets** for software vulnerabilities, defects, exploits, and shellcode
- **evaluation frameworks** for correctness, robustness, and data poisoning
- **models and reproducible pipelines** for secure code generation
- **experimental tools and artifacts** from peer-reviewed scientific publications

Our mission is to build a transparent, verifiable, and secure ecosystem for integrating Large Language Models (LLMs) into software development, especially in safety-critical and security-sensitive contexts.

---
## What You Will Find Here

This organization hosts resources from multiple research projects and publications in AI security, software engineering, and code generation. Current categories include:

### Static Analyzers & Security Tools

- **DeVAIC**: Fast static analysis for detecting vulnerabilities in Python code
- **PatchitPy**: Automated patching of vulnerable Python code via pattern-based transformations
- **ACCA**: Automated correctness assessment of AI-generated code using symbolic execution

### Datasets for Security & Software Engineering

- **PyResBugs**: 5,007 residual Python bugs with NL descriptions
- **Shellcode_IA32**: The largest curated dataset of IA-32 shellcode snippets
- **PoisonPy**: Dataset supporting targeted data-poisoning attacks
- **Human vs AI Code**: Defects, vulnerabilities, and complexity analysis at scale

### Robustness, Data Quality & Industrial Code Generation

- **Residual Bug Generation from Natural Language**: Frameworks for generating realistic residual defects from NL descriptions
- **Impact of Data Quality on Code Models**: Empirical studies on robustness, poisoning resilience, and dataset quality
- **Industrial Code Generation**: Models for domain-specific code synthesis (e.g., VHDL generation from natural language)

Our repositories include code, experimental scripts, datasets, and reproducibility materials.

---
## Research Themes

Our work spans four interconnected areas:

1. **Security of AI-generated Code**
   Vulnerability detection, automated patching, exploit generation, and robustness testing.

2. **Trustworthy LLM Evaluation**
   Correctness, equivalence checking, symbolic execution, reproducible benchmarks.

3. **Software Engineering with AI**
   Defect analysis, complexity metrics, orthogonal defect classification (ODC).

4. **Adversarial ML for Code Models**
   Data poisoning, robustness stress-testing, unsafe pattern injection.

All research artifacts are peer-reviewed and associated with publications at DSN, ISSRE, ICPC, IST, EMSE, JSS, AUSE, and other venues.

---
## Publications Powered by These Repositories

Works based on these repositories have appeared in venues including (non-exhaustive):

- **IEEE/IFIP DSN**
- **IEEE ISSRE**
- **IEEE/ACM ICPC**
- **Empirical Software Engineering (EMSE)**
- **Information and Software Technology (IST)**
- **Automated Software Engineering (AUSE)**
- **Journal of Systems and Software (JSS)**

Full references are available inside each corresponding repository.

---
## Contributing

We encourage contributions from the research and practitioner community.

You can contribute by:

- submitting new datasets
- improving static analysis rules
- adding benchmarks or experimental scripts
- reporting issues or proposing new features

Please open discussions or pull requests inside the relevant repository.

---
## Contact

OSS-Forge is developed by a joint research team from the **University of North Carolina at Charlotte (UNCC)** and the **University of Naples Federico II**.

### Scientific Leadership

- Prof. [Domenico Cotroneo](https://webpages.charlotte.edu/dcotrone/), UNCC

### Core Research Contributors

- Dr. [Pietro Liguori](http://wpage.unina.it/pietro.liguori/), University of Naples Federico II
- [Cristina Improta](http://wpage.unina.it/cristina.improta/), University of Naples Federico II
- Ph.D. students, graduate researchers, and other contributors from the DESSERT research group, University of Naples Federico II

---