Re-check multilevel fallback validation test suite for OnCall.ai
fbad237
YanBoChen committed on
feat(user_prompt): Enhance Medical Query Processing Pipeline
f24fd2b
YanBoChen committed on
feat(user_prompt): update UserPromptProcessor to integrate Llama3-Med42-70B and enhance query validation; add unit tests for condition extraction and matching mechanisms
30fc9ee
YanBoChen committed on
feat(retrieval): add sliding window search method for enhanced semantic search
a1e2d00
YanBoChen committed on
feat(user_prompt): add keyword index checking and enhance medical query validation (add validate_medical_query() and _check_keyword_in_index(), based on implementation_todo_20250730_user_prompt.md)
fa23be2
YanBoChen committed on
feat(llm_clients): enhance MeditronClient to support local model loading and improve error handling
4c919d2
YanBoChen committed on
feat(.gitignore): add cache directory to ignored files
acc25ea
YanBoChen committed on
WIP(llm_clients): add MeditronClient for medical query processing with Hugging Face integration
7282410
YanBoChen committed on
🚀 Implement Advanced Condition Extraction for Medical Query Processing
c414f60
YanBoChen committed on
refactor(deduplication): change deduplication logic from distance-based to exact text matching
37c6713
YanBoChen committed on
feat(retrieval): 1st MVP, enhance search logging and deduplication logic with distance threshold
890989b
YanBoChen committed on
WIP: feat(retrieval): implement basic vector retrieval system for medical documents
6c249e5
YanBoChen committed on
refactor(data_processing): enhance chunking and embedding generation
69b7911
YanBoChen committed on
WIP: Enhance dual keyword chunking to include pre-calculated metadata for treatment chunks
c0317b2
YanBoChen committed on
Add comprehensive tests for chunk quality analysis and embedding validation
f3ac7d9
YanBoChen committed on
Add requirements.txt for test dependencies
8942859
YanBoChen committed on
chore(.gitignore): update ignored files for environment and editor settings
d3ad7d4
YanBoChen committed on
Add new optimized subsets comparison reports in CSV and JSONL formats
fcd6404
YanBoChen committed on
chore(.gitignore): add json and png file extensions to ignored files
35ef528
YanBoChen committed on
chore: remove obsolete analysis files and plots for emergency and treatment datasets
95a7e44
YanBoChen committed on
refactor(.gitignore): reorganize and expand ignored files for clarity and completeness
2e1a43b
YanBoChen committed on
Remove obsolete embedding and index files; add comprehensive embedding test analysis and validation suite
775f8ea
YanBoChen committed on
feat(data_processing): Implement token length control with semantic preservation
922ed80
YanBoChen committed on
refactor(data_processing): add token-based chunking strategy for improved keyword context
6083d96
YanBoChen committed on
refactor(data_processing): optimize chunking strategy with token-based approach
87dcd9d
YanBoChen committed on
fix: add .vscode/ to .gitignore
e72f098
YanBoChen committed on
fix: update .gitignore to include docs/ and *.pyc
3a40790
YanBoChen committed on
feat(data-processing): implement data processing pipeline with embeddings
68cfce0
YanBoChen committed on
Merge pull request #1 from YanBoChen0928/dataprocessing
feat: add emergency subset analysis plots and statistics report
d72142e
YanBoChen committed on
refactor: migrate special terms to JSON configuration
8de0937
YanBoChen committed on
feat: implement special_term (added) emergency keyword matching and metadata extraction
9829a46
YanBoChen committed on
feat: update treatment analysis with keyword density calculations and enhanced visualization (tested on the previous 2 datasets, especially treatment_subset)
654aa66
YanBoChen committed on
feat: add integrity check report for dataset analysis (previous commit test result)
04a03be
YanBoChen committed on
WIP: while preprocessing the dataset, dataset_treatment exploration showed some abnormalities, so add test scripts to identify the problem
d37f4b2
YanBoChen committed on
feat: add comprehensive treatment analysis statistics and update treatment keywords
2ee61dc
YanBoChen committed on
WIP: add dual keyword and text length distribution plots for treatment subset analysis
a5bcfa7
YanBoChen committed on
WIP: try to analyze treatment_subset; enhance emergency and treatment filtering scripts with metadata and analysis functionality
7d8970e
YanBoChen committed on
feat: update emergency subset analysis scripts and visualizations, enhance statistics, and fix keyword files
ee06c0f
YanBoChen committed on
WIP: add emergency subset analysis scripts and visualizations