Az-r-ow committed · Commit 117f801 · 1 Parent(s): a4bde73

doc: Updated the followup-notes files with notes taken locally

  1. followup-notes.md +37 -0
followup-notes.md CHANGED
@@ -14,6 +14,43 @@
 
  ### Questions
 
+ - Any recommendations on vectorizing the sentences?
+ - Should we go for character-based or word-based vectorization?
+
  ### Answers
 
+ - A good start would be word-based vectorization: it's simpler and involves fewer dimensions.
+
+ ### Remarks
+
+ Our idea to train and compare different models seems interesting to Prof. Nassar, and he would like to see an F1-score comparison chart.
+
+ ## Follow up 2
+
+ **Date:** Friday 29 November 2024
+
+ ### Work done
+
+ - HMM
+ - Trained and evaluated LSTM
+ - Trained and evaluated BiLSTM
+ - Starting to train BERT
+
  ### Remarks
+
+ Prof. Nassar said to use `cmarkea/distilcamembert-base` instead of `camembert-base` because it converges faster.
+
+ ## Follow up 3
+
+ **Date:** Friday 17 January 2025
+
+ ### Work done
+
+ - Trained and evaluated CamemBERT
+ - Tested and evaluated LSTM and BiLSTM with POS tags as extra features
+ - Modified the interface to include a selector for the model
+ - Refactored the interface and added tabs for files with multiple sentences
+
+ ### Remarks
+
+ Prof. Nassar was happy with our work and excited to see the final product in the keynote.
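
The F1-score comparison requested in the first follow-up could be tabulated along these lines. This is only a minimal sketch under stated assumptions, not the project's evaluation code: the labels, gold sequence, and per-model predictions are made-up placeholders, `f1_per_label`, `macro_f1`, and `comparison_table` are hypothetical helpers, and macro-averaging is an assumption, since the notes do not say which F1 variant is meant.

```python
from collections import Counter


def f1_per_label(gold, pred):
    """Return {label: F1} for two equal-length sequences of labels."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p where it was wrong
            fn[g] += 1  # missed the true label g
    scores = {}
    for label in sorted(set(gold) | set(pred)):
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(gold, pred):
    """Unweighted mean of the per-label F1 scores (an assumed choice)."""
    scores = f1_per_label(gold, pred)
    return sum(scores.values()) / len(scores)


def comparison_table(gold, predictions):
    """Return (model_name, macro_f1) rows sorted from best to worst."""
    rows = [(name, macro_f1(gold, pred)) for name, pred in predictions.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Placeholder data: not results from the project.
GOLD = ["O", "X", "O", "Y", "O", "O"]
PREDICTIONS = {
    "model_a": ["O", "X", "O", "Y", "O", "O"],
    "model_b": ["O", "X", "O", "O", "O", "O"],
}

for name, score in comparison_table(GOLD, PREDICTIONS):
    print(f"{name:<10} {score:.3f}")
```

The sorted rows can be passed directly to a plotting library to draw the bar chart. Swapping in each trained model's predictions on a shared test set keeps the comparison like-for-like.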