Commit 0f922c9 by Beijuka (verified) · 1 parent: e55c550

Upload folder using huggingface_hub

.dockerignore ADDED
@@ -0,0 +1,11 @@
+ .git
+ __pycache__
+ *.pyc
+ .venv
+ venv
+ .env
+ node_modules
+ results
+ data
+ *.ckpt
+ *.h5
.gitattributes CHANGED
@@ -33,3 +33,6 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ sample_files/Screenshot[[:space:]]2024-07-13[[:space:]]163331.png filter=lfs diff=lfs merge=lfs -text
+ sample_files/alperen_celik_14_08.pdf filter=lfs diff=lfs merge=lfs -text
+ sample_files/medium_article_image.jpg filter=lfs diff=lfs merge=lfs -text
Dockerfile ADDED
@@ -0,0 +1,28 @@
+ FROM python:3.10-slim
+
+ # Install system packages required for OCR (Tesseract, poppler for PDF tools)
+ RUN apt-get update \
+     && apt-get install -y --no-install-recommends \
+         tesseract-ocr \
+         libtesseract-dev \
+         libleptonica-dev \
+         pkg-config \
+         poppler-utils \
+         build-essential \
+         git \
+     && rm -rf /var/lib/apt/lists/*
+
+ # Copy and install Python dependencies
+ COPY requirements.txt /tmp/requirements.txt
+ RUN python -m pip install --upgrade pip && \
+     pip install --no-cache-dir -r /tmp/requirements.txt
+
+ # Copy application
+ COPY . /app
+ WORKDIR /app
+
+ # Expose default port (Spaces will set PORT env var)
+ ENV PORT=7860
+
+ # Run Streamlit app on container start
+ CMD bash -lc "streamlit run streamlit_app.py --server.port ${PORT} --server.address 0.0.0.0"
README.md CHANGED
@@ -1,10 +1,226 @@
+ # OCRInsight
+
  ---
- title: Ocr
- emoji:
- colorFrom: blue
- colorTo: green
+ title: OCRInsight
+ emoji: "🧾"
+ colorFrom: "0A84FF"
+ colorTo: "7C3AED"
  sdk: docker
+ sdk_version: "1.0"
+ app_file: streamlit_app.py
  pinned: false
  ---
+ This Streamlit application lets users run OCR (Optical Character Recognition) with multiple open-source OCR engines and optionally process the OCR results with LLMs (Large Language Models). Users can compare the outputs of different OCR models and run tasks such as summarization or text generation on the OCR results.
+
+ For a comprehensive explanation, see: <https://medium.com/@alperenclk/ocrinsight-building-a-modular-ocr-and-llm-application-f3d3a1ea7a18>
+ ![Sample screen](https://github.com/Alperenclk/All_OCR-s_tools/blob/main/sample_files/sample_screen.png?raw=true)
+
+ ## Features
+ ### Multiple OCR Engines Supported:
+
+ * EasyOCR
+ * DocTR
+ * Tesseract OCR
+ * PaddleOCR
+
+ ### Optional LLM Processing:
+
+ Use models like llama3.1, llama3, gemma2 via Ollama.
+ Perform tasks such as summarization or text generation based on OCR results.
+
+ ### Compare OCR Outputs:
+
+ Select multiple OCR models to compare their outputs side by side.
+ ### Save Outputs:
+
+ Option to save OCR and LLM outputs to text files.
+
+ ## Installation
+ ### Prerequisites
+ - Python 3.7 or higher
+ - pip package manager
+
+ ### Clone the Repository
+
+ ```bash
+ git clone https://github.com/Alperenclk/OCRInsight-open-source-OCRs-Plus-LLM.git
+ cd OCRInsight-open-source-OCRs-Plus-LLM
+ ```
+
+ ### Create a Virtual Environment (Recommended)
+ ```bash
+ python -m venv venv
+ source venv/bin/activate # On Windows use: venv\Scripts\activate
+ ```
+
+ ### Install Required Python Packages
+ Install the required packages using pip:
+
+ ```bash
+ pip install -r requirements.txt
+ ```
+ Note: The requirements.txt file includes basic dependencies. Depending on the OCR engines and LLM support you want to use, you may need to install additional dependencies as described below.
+
+ ## Install OCR Engine Dependencies
+
+ ### EasyOCR
+ ```bash
+ pip install easyocr
+ ```
+
+ ### DocTR
+ ```bash
+ pip install python-doctr[torch]
+ ```
+ Note: For GPU support, ensure that PyTorch is installed with CUDA support.
+
+ ### Tesseract OCR
+ Install Tesseract OCR Engine:
+
+ #### Windows:
+
+ Download the Tesseract installer from UB Mannheim: <https://github.com/UB-Mannheim/tesseract/wiki>.
+
+ - Run the installer and follow the instructions.
+ - Note the installation path (e.g., `C:\Program Files\Tesseract-OCR\tesseract.exe`).
+ - Update the `pytesseract.pytesseract.tesseract_cmd` variable in `ocr_engines.py` to point to the Tesseract executable.
+
+ #### macOS:
+
+ ```bash
+ brew install tesseract
+ ```
+ #### Ubuntu/Linux:
+
+ ```bash
+ sudo apt-get update
+ sudo apt-get install tesseract-ocr
+ ```
+
+ ##### Install Python Wrapper:
+
+ ```bash
+ pip install pytesseract
+ ```
+ ##### Language Data Files:
+
+ Ensure that the language data files for the languages you intend to use are installed. For example, to install Turkish language data on Ubuntu:
+
+ ```bash
+ sudo apt-get install tesseract-ocr-tur
+ ```
+
+ ### PaddleOCR
+ #### Install PaddlePaddle:
+
+ ##### CPU Version:
+
+ ```bash
+ pip install paddlepaddle
+ ```
+ ##### GPU Version:
+
+ Refer to the PaddlePaddle Installation Guide for GPU support.
+
+ #### Install PaddleOCR:
+
+ ```bash
+ pip install paddleocr
+ ```
+
+ ## Install LLM Dependencies (Optional)
+ If you want to use the LLM features, install the **Ollama** Python client (the Ollama server itself is installed separately and must be running with the chosen model available):
+
+ ```bash
+ pip install ollama
+ ```
+ Note: If you do not wish to use the LLM features, **you can skip this step**. The application will work in OCR-only mode.
+
+ ## Usage
+ ### Run the Application
+ ```bash
+ streamlit run app.py
+ ```
+
+ ## Application Interface
+ ### Settings Sidebar:
+
+ **Select Device:** Choose between CPU and GPU (if available).
+
+ **Language Selection:** Choose the language for OCR processing.
+
+ **Select OCR Models:** Choose one or more OCR models to use.
+
+ **LLM Model Selection:** Choose an LLM model or select "Only OCR Mode" to disable LLM features.
+
+ **LLM Command and Task Type:** Enter commands and select tasks if LLM is enabled.
+
+ **Save Outputs:** Option to save OCR and LLM outputs to files.
+
+ ### Main Area:
+
+ **File Upload:** Upload a PDF or image file for OCR processing.
+
+ **OCR Results:** View the OCR results from the selected models.
+
+ **LLM Processing:** Perform LLM processing on the combined OCR text (if enabled).
+
+ ## Notes
+ **Language Support:**
+
+ Ensure that the necessary language data files or models are installed for each OCR engine you intend to use.
+ Some OCR engines may require specific language codes or configurations.
+
+ **GPU Support:**
+
+ For GPU acceleration, ensure that your hardware supports it and that the necessary libraries (e.g., CUDA) are installed.
+ Not all OCR engines support GPU acceleration.
+
+ **Performance:**
+
+ Running multiple OCR engines on the same file may consume significant resources.
+ Processing large files or images may take longer.
+ ## Modular Code Structure
+ The application is structured modularly to enhance maintainability and extensibility.
+
+ **app.py:** The main Streamlit application script.
+
+ **ocr_engines.py:** Contains functions to initialize and perform OCR using different engines.
+
+ **llm_processor.py:** Contains functions for LLM processing (optional).
+ ## Modifying the Code
+
+ **Adding a New OCR Engine:**
+
+ Create a new function in ocr_engines.py to initialize and perform OCR with the new engine.
+ Update initialize_ocr_models and perform_ocr functions accordingly.
+
+ **Modifying LLM Functionality:**
+
+ Update llm_processor.py with new LLM models or processing methods.
+
+ **Disabling LLM Features:**
+
+ If you don't want to use LLM features, you don't need to install ollama.
+ The application will automatically disable LLM features if ollama is not installed.
+
+ ## Troubleshooting
+ **Import Errors:**
+
+ If you encounter import errors, ensure that all required packages are installed.
+ For optional features (like LLM), missing packages will disable those features without affecting the rest of the application.
+
+ **Tesseract Not Found:**
+
+ Ensure that the Tesseract executable path is correctly set in ocr_engines.py.
+ Verify that Tesseract is installed and the path is correct.
+
+ **Language Data Missing:**
+
+ Install the necessary language data files for the OCR engines.
+ ## Contributing
+ Contributions are welcome! Please fork the repository and submit a pull request for any improvements or new features.
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ ## License
+ This project is licensed under the **MIT** License.
+ # OCR
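The README's "Adding a New OCR Engine" steps can be illustrated with a small sketch. `DummyOCR` and `DummyReader` are hypothetical names used only for illustration; the two functions reuse the signatures of `initialize_ocr_models` and `perform_ocr` from `ocr_engines.py`, but this is a stripped-down stand-in, not the project's code.

```python
# Hypothetical example: "DummyOCR" is not a real engine. It only shows the two
# places where a new engine has to be wired in.


class DummyReader:
    """Stand-in for a real OCR reader object (assumption for illustration)."""

    def read(self, image):
        # A real engine would run text recognition on the image here.
        return ["dummy line 1", "dummy line 2"]


def initialize_ocr_models(ocr_models, language_code, device):
    """Build one reader per selected engine, keyed by engine name."""
    ocr_readers = {}
    if "DummyOCR" in ocr_models:
        ocr_readers["DummyOCR"] = DummyReader()
    return ocr_readers


def perform_ocr(model_name, ocr_readers, image, language_code):
    """Run the named engine and return its text, or a placeholder if missing."""
    if model_name == "DummyOCR":
        reader = ocr_readers.get("DummyOCR")
        if reader is None:
            return "[DummyOCR not available]"
        return "\n".join(reader.read(image))
    return ""


readers = initialize_ocr_models(["DummyOCR"], "en", "CPU")
print(perform_ocr("DummyOCR", readers, image=None, language_code="en"))
```

In the real application the new engine name would presumably also need to appear in the "Select OCR Models" list in `app.py` so that users can select it.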
README_DEPLOY.md ADDED
@@ -0,0 +1,30 @@
+ Deployment notes for Hugging Face Spaces
+
+ 1) HF_TOKEN secret
+ - Create a Hugging Face token at https://huggingface.co/settings/tokens
+ - Token should have repository write permissions (to create and push Spaces)
+ - In GitHub, go to Settings -> Secrets -> Actions -> New repository secret
+ - Name: HF_TOKEN
+ - Value: <your_token_here>
+
+ 2) Streamlit compatibility
+ - The workflow creates the Space with `space_sdk='streamlit'` so it will run as a Streamlit app.
+ - Hugging Face Spaces will run `streamlit_app.py` or `app.py` by default; this repo contains `streamlit_app.py` to be explicit.
+
+ 3) System dependencies
+ - Some OCR engines require system packages (e.g., Tesseract binary, system libs for PaddlePaddle). Hugging Face's Streamlit SDK does not allow installing system packages.
+ - If you need system packages, use a Docker-based Space (set `space_sdk='docker'` and add a Dockerfile that installs required system packages).
+
+ 4) LLM / Ollama
+ - The app optionally uses `ollama` for LLM features. Ollama is not installed by default in Spaces; LLM features will be disabled if `ollama` isn't present.
+
+ 5) Tesseract
+ - Ensure Tesseract is available in the environment or use the Docker approach to install it.
+
+ 6) Running CI/CD
+ - After pushing to `main` and setting `HF_TOKEN` secret, the GitHub Actions workflow `.github/workflows/deploy_to_hf.yml` will create the Space and upload the repository.
+
+ Note: This repository includes a `Dockerfile` and the CI workflow is configured to create a Docker-based Space (`space_sdk='docker'`). The Dockerfile installs system dependencies such as Tesseract so the OCR engines can run inside the Space container.
+
+ 7) Troubleshooting
+ - If the deployment fails, open the Actions run logs to see the error and adjust the workflow or repository accordingly.
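The workflow file itself is not part of this diff, so the following is only a sketch of what a "create the Space and upload the repository" step might look like with `huggingface_hub`. The function name `deploy_space`, the `api` parameter, and the repo id `user/demo` are assumptions made for illustration. `HfApi.create_repo` and `HfApi.upload_folder` are existing `huggingface_hub` calls.

```python
def deploy_space(repo_id, folder_path, token, api=None):
    """Create (or reuse) a Docker-based Space and upload a local folder to it.

    `api` is injectable so the function can be exercised without network
    access. When it is omitted, a real huggingface_hub.HfApi client is used.
    """
    if api is None:
        # Imported lazily so the sketch loads even without huggingface_hub.
        from huggingface_hub import HfApi

        api = HfApi(token=token)

    # exist_ok=True makes repeated CI runs reuse the Space instead of failing.
    api.create_repo(
        repo_id=repo_id,
        repo_type="space",
        space_sdk="docker",
        exist_ok=True,
    )
    api.upload_folder(
        repo_id=repo_id,
        repo_type="space",
        folder_path=folder_path,
    )
    return repo_id


# Example call (hypothetical repo id, token taken from the HF_TOKEN secret):
# deploy_space("user/demo", ".", token=os.environ["HF_TOKEN"])
```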
app.py ADDED
@@ -0,0 +1,197 @@
+ import streamlit as st
+ from PIL import Image
+ import fitz  # PyMuPDF
+ import numpy as np
+ import tempfile
+ import os
+ import time
+ import io
+ import json
+ import torch
+ import cv2
+
+ # Import OCR engines
+ import ocr_engines
+
+ # Try importing LLM processor if LLM features are to be used
+ llm_available = False
+ try:
+     import llm_processor
+
+     llm_available = True
+ except ImportError:
+     pass  # LLM features will be disabled
+
+ # Create results folder if it doesn't exist
+ if not os.path.exists("results"):
+     os.makedirs("results")
+
+ # Streamlit application
+ st.title("OCRInsight")
+
+ # Sidebar
+ st.sidebar.header("Settings")
+
+
+ # Function to save text to file
+ def save_text_to_file(attributes_of_output, all_ocr_text, filename):
+     with open(filename, "a", encoding="utf-8") as f:
+         f.write("\n" + "-" * 75 + "\n")
+         f.write("Attributes of Output:\n")
+         f.write(attributes_of_output)
+         f.write("\nOCR Result:\n")
+         f.write(all_ocr_text)
+         f.write("\n" + "-" * 75 + "\n")
+     st.success(f"{filename} saved successfully!")
+
+
+ # Device selection
+ device = st.sidebar.radio("Select Device", ["CPU", "GPU (CUDA)"])
+ save_output = st.sidebar.checkbox("Save Outputs")
+
+ # Language selection
+ language = st.sidebar.selectbox(
+     "Select Language", ["Türkçe", "English", "Français", "Deutsch", "Español"]
+ )
+
+ # Map selected language to language codes
+ language_codes = {
+     "Türkçe": "tr",
+     "English": "en",
+     "Français": "fr",
+     "Deutsch": "de",
+     "Español": "es",
+ }
+
+ # OCR model selection
+ ocr_models = st.sidebar.multiselect(
+     "Select OCR Models",
+     ["EasyOCR", "DocTR", "Tesseract", "PaddleOCR"],
+     ["EasyOCR"],  # default selection
+ )
+
+ # LLM model selection
+ llm_model = st.sidebar.selectbox(
+     "Select LLM Model", ["Only OCR Mode", "llama3.1", "llama3", "gemma2"]
+ )
+
+ # Conditional UI elements based on LLM model selection
+ if llm_model != "Only OCR Mode" and llm_available:
+     user_command = st.sidebar.text_input("Enter command:", "")
+
+     task_type = st.sidebar.radio("Select task type:", ["Summarize", "Generate"])
+ elif llm_model != "Only OCR Mode" and not llm_available:
+     st.sidebar.warning(
+         "LLM features are not available. Please install 'ollama' to enable LLM processing."
+     )
+     llm_model = "Only OCR Mode"
+
+ # Check GPU availability
+ if device == "GPU (CUDA)" and not torch.cuda.is_available():
+     st.sidebar.warning("GPU (CUDA) not available. Switching to CPU.")
+     device = "CPU"
+
+ # Initialize OCR models
+ ocr_readers = ocr_engines.initialize_ocr_models(
+     ocr_models, language_codes[language], device
+ )
+
+ # File upload
+ uploaded_file = st.file_uploader(
+     "Upload File (PDF, Image)", type=["pdf", "png", "jpg", "jpeg"]
+ )
+
+ # Create results folder if it doesn't exist
+ if not os.path.exists("results"):
+     os.makedirs("results")
+
+ if uploaded_file is not None:
+     start_time = time.time()
+
+     if uploaded_file.type == "application/pdf":
+         pdf_document = fitz.open(stream=uploaded_file.read(), filetype="pdf")
+         images = []
+         for page_num in range(len(pdf_document)):
+             page = pdf_document.load_page(page_num)
+             pix = page.get_pixmap()
+             img_data = pix.tobytes("png")
+             img = Image.open(io.BytesIO(img_data))
+             images.append(img)
+         total_pages = len(pdf_document)
+         pdf_document.close()
+     else:
+         images = [Image.open(uploaded_file)]
+         total_pages = 1
+
+     all_ocr_texts = {
+         model_name: "" for model_name in ocr_models
+     }  # To store OCR text for each model
+
+     for page_num, image in enumerate(images, start=1):
+         st.image(image, caption=f"Page {page_num}/{total_pages}", use_column_width=True)
+
+         # Perform OCR with each selected model
+         for model_name in ocr_models:
+             text = ocr_engines.perform_ocr(
+                 model_name, ocr_readers, image, language_codes[language]
+             )
+             all_ocr_texts[
+                 model_name
+             ] += f"--- Page {page_num} ({model_name}) ---\n{text}\n\n"
+
+             st.subheader(f"OCR Result ({model_name}) - Page {page_num}/{total_pages}:")
+             st.text(text)
+
+     end_time = time.time()
+     process_time = end_time - start_time
+
+     st.info(f"Processing time: {process_time:.2f} seconds")
+
+     # Save OCR outputs if selected
+     if save_output:
+         attributes_of_output = {
+             "Model Names": ocr_models,
+             "Language": language,
+             "Device": device,
+             "Process Time": process_time,
+         }
+         for model_name, ocr_text in all_ocr_texts.items():
+             filename = f"results//ocr_output_{model_name}.txt"
+             save_text_to_file(
+                 json.dumps(attributes_of_output, ensure_ascii=False), ocr_text, filename
+             )
+
+     # LLM processing
+     if (
+         llm_model != "Only OCR Mode"
+         and llm_available
+         and st.sidebar.button("Start LLM Processing")
+     ):
+         st.subheader("LLM Processing Result:")
+
+         # Combine all OCR texts
+         combined_ocr_text = "\n".join(all_ocr_texts.values())
+
+         # Prepare the prompt based on the task type
+         if task_type == "Summarize":
+             prompt = f"Please summarize the following text. Command: {user_command}\n\nText: {combined_ocr_text}"
+         else:  # "Generate"
+             prompt = f"Please generate new text based on the following text. Command: {user_command}\n\nText: {combined_ocr_text}"
+
+         llm_output = llm_processor.process_with_llm(llm_model, prompt)
+
+         # Display the result
+         st.write(f"Processing completed using '{llm_model}' model.")
+         st.text_area("LLM Output:", value=llm_output, height=300)
+
+         # Save LLM output if selected
+         if save_output:
+             filename = "llm_output.txt"
+             save_text_to_file(llm_output, "", filename)
+
+     elif llm_model != "Only OCR Mode" and not llm_available:
+         st.warning(
+             "LLM features are not available. Please install 'ollama' to enable LLM processing."
+         )
+
+ st.sidebar.info(f"Selected device: {device}")
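The prompt construction in `app.py` sits inside the Streamlit flow, which makes it hard to check on its own. A minimal sketch of the same logic as a pure function is shown below. `build_prompt` is a hypothetical helper name that does not exist in the commit; the prompt strings are copied from `app.py`.

```python
def build_prompt(task_type, user_command, ocr_text):
    """Return the LLM prompt for the chosen task, mirroring the strings in app.py."""
    if task_type == "Summarize":
        instruction = "Please summarize the following text."
    else:  # "Generate"
        instruction = "Please generate new text based on the following text."
    return f"{instruction} Command: {user_command}\n\nText: {ocr_text}"


print(build_prompt("Summarize", "keep it short", "Example OCR text"))
```

Keeping this step free of Streamlit calls would let it be unit-tested without starting the app.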
llm_processor.py ADDED
@@ -0,0 +1,17 @@
+ # llm_processor.py
+
+ import ollama
+
+
+ def process_with_llm(llm_model, prompt):
+     response = ollama.chat(
+         model=llm_model,
+         messages=[
+             {
+                 "role": "user",
+                 "content": prompt,
+             },
+         ],
+     )
+     llm_output = response["message"]["content"]
+     return llm_output
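`process_with_llm` calls `ollama.chat` directly, so any failure (for example, no Ollama server running) would surface as an exception in the Streamlit app. The sketch below is one possible variant, not the committed code: the `chat` parameter and the `[LLM error: ...]` placeholder are assumptions, with the placeholder modeled on the `[Tesseract error: ...]` string used in `ocr_engines.py`.

```python
def process_with_llm(llm_model, prompt, chat=None):
    """Send a single user message to the model and return the reply text.

    `chat` is injectable for testing. When omitted, ollama.chat is used.
    """
    if chat is None:
        import ollama  # imported lazily so the module loads without ollama

        chat = ollama.chat

    try:
        response = chat(
            model=llm_model,
            messages=[{"role": "user", "content": prompt}],
        )
    except Exception as exc:
        # Report the failure as text instead of crashing the caller.
        return f"[LLM error: {exc}]"
    return response["message"]["content"]
```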
ocr_engines.py ADDED
@@ -0,0 +1,137 @@
+ """OCR engine initializers and runners with safer Tesseract handling."""
+
+ import os
+ import sys
+ import tempfile
+ import numpy as np
+
+ try:
+     import easyocr
+ except Exception:
+     easyocr = None
+
+ try:
+     from doctr.io import DocumentFile
+     from doctr.models import ocr_predictor
+ except Exception:
+     DocumentFile = None
+     ocr_predictor = None
+
+ try:
+     from paddleocr import PaddleOCR
+ except Exception:
+     PaddleOCR = None
+
+ try:
+     import pytesseract
+ except Exception:
+     pytesseract = None
+
+ try:
+     import cv2
+ except Exception:
+     cv2 = None
+
+
+ def initialize_ocr_models(ocr_models, language_code, device):
+     ocr_readers = {}
+
+     if "EasyOCR" in ocr_models and easyocr is not None:
+         ocr_readers["EasyOCR"] = easyocr.Reader(
+             [language_code], gpu=(device == "GPU (CUDA)")
+         )
+
+     if "DocTR" in ocr_models and ocr_predictor is not None:
+         ocr_readers["DocTR"] = ocr_predictor(pretrained=True)
+
+     if "PaddleOCR" in ocr_models and PaddleOCR is not None:
+         use_gpu = True if device == "GPU (CUDA)" else False
+         ocr_readers["PaddleOCR"] = PaddleOCR(lang=language_code, use_gpu=use_gpu)
+
+     # Tesseract: only set executable path for known Windows locations; on Unix, assume tesseract is on PATH
+     if "Tesseract" in ocr_models and pytesseract is not None:
+         if sys.platform.startswith("win"):
+             # common Windows installation path
+             pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
+         else:
+             # check common unix paths and set if tesseract binary exists there
+             for p in ("/usr/bin/tesseract", "/usr/local/bin/tesseract"):
+                 if os.path.exists(p):
+                     pytesseract.pytesseract.tesseract_cmd = p
+                     break
+
+     return ocr_readers
+
+
+ def perform_ocr(model_name, ocr_readers, image, language_code):
+     text = ""
+
+     if model_name == "EasyOCR":
+         reader = ocr_readers.get("EasyOCR")
+         if reader is None:
+             return "[EasyOCR not available]"
+         result = reader.readtext(np.array(image))
+         text = "\n".join([res[1] for res in result])
+
+     elif model_name == "DocTR":
+         predictor = ocr_readers.get("DocTR")
+         if predictor is None or DocumentFile is None:
+             return "[DocTR not available]"
+         with tempfile.NamedTemporaryFile(delete=False, suffix=".png") as tmp_file:
+             image.save(tmp_file, format="PNG")
+             file_path = tmp_file.name
+         doc = DocumentFile.from_images(file_path)
+         result = predictor(doc)
+         # Safely iterate pages/blocks
+         pages = []
+         for page in result.pages:
+             page_text_blocks = []
+             for block in page.blocks:
+                 lines = [" ".join([word.value for word in line.words]) for line in block.lines]
+                 page_text_blocks.append("\n".join(lines))
+             pages.append("\n\n".join(page_text_blocks))
+         text = "\n\n".join(pages)
+         try:
+             os.unlink(file_path)
+         except Exception:
+             pass
+
+     elif model_name == "PaddleOCR":
+         reader = ocr_readers.get("PaddleOCR")
+         if reader is None:
+             return "[PaddleOCR not available]"
+         result = reader.ocr(np.array(image))
+         # result may be empty or structured per line
+         try:
+             text = "\n".join([line[1][0] for line in result[0]])
+         except Exception:
+             # fallback: join any text tokens found
+             tokens = []
+             for page in result:
+                 for line in page:
+                     if len(line) > 1 and isinstance(line[1], (list, tuple)):
+                         tokens.append(line[1][0])
+             text = "\n".join(tokens)
+
+     elif model_name == "Tesseract":
+         if pytesseract is None:
+             return "[pytesseract not available]"
+         # Convert PIL image to RGB if not already
+         try:
+             if image.mode != "RGB":
+                 image = image.convert("RGB")
+         except Exception:
+             pass
+         # Convert image to OpenCV format if cv2 is available
+         if cv2 is not None:
+             opencv_image = cv2.cvtColor(np.array(image), cv2.COLOR_RGB2BGR)
+         else:
+             # fallback: use raw numpy array
+             opencv_image = np.array(image)
+         config = f"--oem 3 --psm 6 -l {language_code}"
+         try:
+             text = pytesseract.image_to_string(opencv_image)  # , config=config
+         except Exception as e:
+             text = f"[Tesseract error: {e}]"
+
+     return text
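In the Tesseract branch, `config` is built with `-l {language_code}` but never passed, so Tesseract runs with its default language. One likely reason is that `app.py` supplies two-letter codes (`tr`, `en`), while Tesseract's language data uses three-letter names such as `tur` and `eng` (the README's `tesseract-ocr-tur` package is an example). A minimal sketch of a mapping helper follows. `TESSERACT_LANGS` and `tesseract_kwargs` are hypothetical names, and falling back to `eng` for unknown codes is an assumption.

```python
# Two-letter codes used by app.py mapped to Tesseract language data names.
TESSERACT_LANGS = {
    "tr": "tur",
    "en": "eng",
    "fr": "fra",
    "de": "deu",
    "es": "spa",
}


def tesseract_kwargs(language_code, oem=3, psm=6):
    """Build keyword arguments for pytesseract.image_to_string."""
    lang = TESSERACT_LANGS.get(language_code, "eng")  # assumed fallback
    return {"lang": lang, "config": f"--oem {oem} --psm {psm}"}


print(tesseract_kwargs("tr"))

# Possible use inside perform_ocr (requires the matching language data to be
# installed on the system):
# text = pytesseract.image_to_string(opencv_image, **tesseract_kwargs(language_code))
```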
requirement.txt ADDED
@@ -0,0 +1,11 @@
+ streamlit
+ Pillow
+ PyMuPDF
+ numpy
+ torch
+ easyocr
+ python-doctr[torch]
+ paddlepaddle # For CPU; for GPU, specify the appropriate version
+ paddleocr
+ pytesseract
+ opencv-python
requirements.txt ADDED
@@ -0,0 +1,11 @@
+ streamlit
+ Pillow
+ PyMuPDF
+ numpy
+ torch
+ easyocr
+ python-doctr[torch]
+ paddlepaddle
+ paddleocr
+ pytesseract
+ opencv-python
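Because `ocr_engines.py` treats every OCR package as optional, it can help to check which of the listed requirements are actually importable in a given environment. The sketch below uses only the standard library. `OPTIONAL_MODULES` and `available_engines` are hypothetical names, and the module names are the import names that appear in `ocr_engines.py`.

```python
import importlib.util

# Engine label -> import name (import names differ from some pip names,
# e.g. python-doctr is imported as "doctr").
OPTIONAL_MODULES = {
    "EasyOCR": "easyocr",
    "DocTR": "doctr",
    "PaddleOCR": "paddleocr",
    "Tesseract": "pytesseract",
}


def available_engines(modules=OPTIONAL_MODULES):
    """Return the engine labels whose Python package can be found."""
    return [
        name
        for name, module in modules.items()
        if importlib.util.find_spec(module) is not None
    ]


print(available_engines())
```

Note that this only checks for the Python packages; the Tesseract binary itself still has to be present on the system.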
results/ocr_output_DocTR.txt ADDED
@@ -0,0 +1,572 @@
1
+
2
+ ---------------------------------------------------------------------------
3
+ Attributes of Output:
4
+ {"Model Names": ["DocTR"], "Language": "Türkçe", "Device": "CPU", "Process Time": 1.2372400760650635}
5
+ OCR Result:
6
+ --- Page 1 (DocTR) ---
7
+ Genel olarak, sag eliniz ne kadar yi çalisti?
8
+ Sag parmaklanniz ne kadar iyi hareket etti?
9
+ Sag bileginiz ne kadar yi hareket etti?
10
+ Sag elinizin kuvveti nasildi?
11
+ Sag elinizde duyu (his) nasildi?
12
+
13
+
14
+ ---------------------------------------------------------------------------
15
+
16
+ ---------------------------------------------------------------------------
17
+ Attributes of Output:
18
+ {"Model Names": ["DocTR", "EasyOCR"], "Language": "Türkçe", "Device": "CPU", "Process Time": 1.7959399223327637}
19
+ OCR Result:
20
+ --- Page 1 (DocTR) ---
21
+ Genel olarak, sag eliniz ne kadar yi çalisti?
22
+ Sag parmaklanniz ne kadar iyi hareket etti?
23
+ Sag bileginiz ne kadar yi hareket etti?
24
+ Sag elinizin kuvveti nasildi?
25
+ Sag elinizde duyu (his) nasildi?
26
+
27
+
28
+ ---------------------------------------------------------------------------
29
+
30
+ ---------------------------------------------------------------------------
31
+ Attributes of Output:
32
+ {"Model Names": ["DocTR", "EasyOCR"], "Language": "Türkçe", "Device": "CPU", "Process Time": 1.8714756965637207}
33
+ OCR Result:
34
+ --- Page 1 (DocTR) ---
35
+ Genel olarak, sag eliniz ne kadar yi çalisti?
36
+ Sag parmaklanniz ne kadar iyi hareket etti?
37
+ Sag bileginiz ne kadar yi hareket etti?
38
+ Sag elinizin kuvveti nasildi?
39
+ Sag elinizde duyu (his) nasildi?
40
+
41
+
42
+ ---------------------------------------------------------------------------
43
+
44
+ ---------------------------------------------------------------------------
45
+ Attributes of Output:
46
+ {"Model Names": ["DocTR"], "Language": "Türkçe", "Device": "CPU", "Process Time": 1.326887845993042}
47
+ OCR Result:
48
+ --- Page 1 (DocTR) ---
49
+ Genel olarak, sag eliniz ne kadar yi çalisti?
50
+ Sag parmaklanniz ne kadar iyi hareket etti?
51
+ Sag bileginiz ne kadar yi hareket etti?
52
+ Sag elinizin kuvveti nasildi?
53
+ Sag elinizde duyu (his) nasildi?
54
+
55
+
56
+ ---------------------------------------------------------------------------
57
+
58
+ ---------------------------------------------------------------------------
59
+ Attributes of Output:
60
+ {"Model Names": ["DocTR"], "Language": "Türkçe", "Device": "CPU", "Process Time": 9.680048942565918}
61
+ OCR Result:
62
+ --- Page 1 (DocTR) ---
63
+ ALPEREN ÇELIK
64
+ - +90 5453851876 peradkisfotgmalcom im.com/m/Aperenell-791aits, 0 llh-com/Alperencls,
65
+ Education
66
+ Afyon Kocatepe University
67
+ Sep. 2018 - Jan 2024
68
+ Bachelor of Mechatronic Engineering
69
+ 3.1 gpa
70
+ Relevant Coursework
71
+ Artificial Intelligence
72
+ Software Methodology
73
+ Database Management
74
+ Internet Technology
75
+ Computer Vision
76
+ Algorithms Analysis
77
+ Data Structures
78
+ Systems Programming
79
+ Experience
80
+ Novelty AI
81
+ Sep 2023 - Present
82
+ AI/ML Engineer
83
+ Gebze, Turkiye
84
+ I developed dentification system software for a bank using deep learning techniques. This system aimed to increase
85
+ security by optimizing customer authentication processes and was successfully implemented.
86
+ At a defense industry company, I managed the installation of industrial robots (ABB) on the production line. In this
87
+ project, I provided complex automation solutions to integrate robotic: systems and increase operational efficiency-
88
+ For one of the leading telecom companies in Turkey, I developed software that enables live broadcasting and OTT
89
+ automation with Suitest software using Python and image processing techniques.
90
+ For a beverage company, I was part of the team that developed an artificial intelligence application that checks the
91
+ recognition and accuracy of product labels on the production line.
92
+ University of Malta
93
+ Jul 2023 - Aug 2023 (3 mos)
94
+ AI Researcher
95
+ Maita
96
+ * I worked on a sem-autonomous drone that tries to detect. waste pet bottles on beaches with artificial intelligence. I used
97
+ Python and C++ languages in this project
98
+ TC Diyanet isleri Bagkanhg
99
+ May 2019 - Jul 2022 (3 yrs 3 mos)
100
+ Civil Servant
101
+ Ayonkarhisar, Turkiye
102
+ While studying Mechatronics Engineering at the university, I worked as a civil servant and eammed a living. During these
103
+ three years, the most essential value that this job added to me was to develop myself discipline and determination in
104
+ order to keep the tough school and work life: in balance.
105
+ DHMI Erzurum Airport
106
+ Jul 2022 - Aug 2022 (2 mos)
107
+ Mechatronics Engineer Intern
108
+ Brzurum, Turkiye
109
+ * I worked as an intern in areas such as sensors, electronic cards, x-ray devices in terminal electronics. My most significant
110
+ gain from this internship was learning corporate work discipline and internal relationship techniques.
111
+ Ecodation
112
+ Jun 2021 - Jul 2021 (2 mos)
113
+ Python Developer Intern
114
+ Istanbul, Turkiye
115
+ I made projects such as navigation and customer tracking system for cargo delivery. My main achievement was learning
116
+ to work as a team.
117
+ Projects
118
+ Personalized Product Analysis with AI Python, Net, Huatei Cloud
119
+ Oct 2023
120
+ Our application is designed to help users make informed and healthy choices when purchasing products. By uploading a
121
+ photo of the product's ingredients, the user can get a detailed analysis of how suitable and beneficial the product is for
122
+ them. The application scans the content of the product with artificial intelligence systems and identifies substances that
123
+ may cause allergies or adverse effects that the user has previously dentified. It provides the user with a summary of the
124
+ product's content and the presence or absence of the substances they have identified. Thanks to this application, we
125
+ came 3rd in BIk Academy and Huawei Coding Marathon
126
+ Multi View Breast Cancer Classification App - Python, PyQs, Deep learming
127
+ Apr 2022
128
+ Within the scope of Teknofest artificial intelligence in health competition, the team I captained by developing a
129
+ multi-model deep learning network for the diagnosis of breast cancers succeeded in becoming a finalist.
130
+ Optical Character Recognition with Streamlit I Python, Streamlit, Huggingface
131
+ Jan 2024
132
+ OCR (Optical Character Recognition) technology has transformed how we interact with textual content in the digital
133
+ realm. By converting images, scanned documents, and other media into editable and searchable text, OCR enables us to
134
+ extract valuable information from diverse sources.
135
+
136
+ --- Page 2 (DocTR) ---
137
+ Technical Skills
138
+ Languages: Python, C/ C++, Matlab, SQL, RobotStudio, Ros, PLC
139
+ Developer Tools: Tensorflow, Pytorch, Google Cloud Platform, Huawei Cloud
140
+ Technologies/Frameworks: Linux, GitHub, Selenium, Docker
141
+ Certificates
142
+ Teknofest Finalist Certificate:13 Foundation
143
+ BTK Academy And Huawei Coding Marathon Certificate of Competitio Winning (3rd)
144
+ EITCA Artificial Intelligence Academy 12 Certificates European Union
145
+ TensorFlow: Advanced Techniques Specialization Coursera
146
+ Google Cloud Expertise Google
147
+ AI Expert Training Program 6 months Republic Of Tirkiye Ministry of Industry and Technology
148
+ Introduction to Machine Learning in Production Coursera
149
+ Hands-on ROS Training with Python :Udemy
150
+ Image processing with deep learning :Udemy
151
+
152
+
153
+ ---------------------------------------------------------------------------
154
+
155
+ ---------------------------------------------------------------------------
156
+ Attributes of Output:
157
+ {"Model Names": ["DocTR"], "Language": "Türkçe", "Device": "CPU", "Process Time": 9.149980068206787}
158
+ OCR Result:
159
+ --- Page 1 (DocTR) ---
160
+ ALPEREN ÇELIK
161
+ - +90 5453851876 peradkisfotgmalcom im.com/m/Aperenell-791aits, 0 llh-com/Alperencls,
162
+ Education
163
+ Afyon Kocatepe University
164
+ Sep. 2018 - Jan 2024
165
+ Bachelor of Mechatronic Engineering
166
+ 3.1 gpa
167
+ Relevant Coursework
168
+ Artificial Intelligence
169
+ Software Methodology
170
+ Database Management
171
+ Internet Technology
172
+ Computer Vision
173
+ Algorithms Analysis
174
+ Data Structures
175
+ Systems Programming
176
+ Experience
177
+ Novelty AI
178
+ Sep 2023 - Present
179
+ AI/ML Engineer
180
+ Gebze, Turkiye
181
+ I developed dentification system software for a bank using deep learning techniques. This system aimed to increase
182
+ security by optimizing customer authentication processes and was successfully implemented.
183
+ At a defense industry company, I managed the installation of industrial robots (ABB) on the production line. In this
184
+ project, I provided complex automation solutions to integrate robotic: systems and increase operational efficiency-
185
+ For one of the leading telecom companies in Turkey, I developed software that enables live broadcasting and OTT
186
+ automation with Suitest software using Python and image processing techniques.
187
+ For a beverage company, I was part of the team that developed an artificial intelligence application that checks the
188
+ recognition and accuracy of product labels on the production line.
189
+ University of Malta
190
+ Jul 2023 - Aug 2023 (3 mos)
191
+ AI Researcher
192
+ Maita
193
+ * I worked on a sem-autonomous drone that tries to detect. waste pet bottles on beaches with artificial intelligence. I used
194
+ Python and C++ languages in this project
195
+ TC Diyanet isleri Bagkanhg
196
+ May 2019 - Jul 2022 (3 yrs 3 mos)
197
+ Civil Servant
198
+ Ayonkarhisar, Turkiye
199
+ While studying Mechatronics Engineering at the university, I worked as a civil servant and eammed a living. During these
200
+ three years, the most essential value that this job added to me was to develop myself discipline and determination in
201
+ order to keep the tough school and work life: in balance.
202
+ DHMI Erzurum Airport
203
+ Jul 2022 - Aug 2022 (2 mos)
204
+ Mechatronics Engineer Intern
205
+ Brzurum, Turkiye
206
+ * I worked as an intern in areas such as sensors, electronic cards, x-ray devices in terminal electronics. My most significant
207
+ gain from this internship was learning corporate work discipline and internal relationship techniques.
208
+ Ecodation
209
+ Jun 2021 - Jul 2021 (2 mos)
210
+ Python Developer Intern
211
+ Istanbul, Turkiye
212
+ I made projects such as navigation and customer tracking system for cargo delivery. My main achievement was learning
213
+ to work as a team.
214
+ Projects
215
+ Personalized Product Analysis with AI Python, Net, Huatei Cloud
216
+ Oct 2023
217
+ Our application is designed to help users make informed and healthy choices when purchasing products. By uploading a
218
+ photo of the product's ingredients, the user can get a detailed analysis of how suitable and beneficial the product is for
219
+ them. The application scans the content of the product with artificial intelligence systems and identifies substances that
220
+ may cause allergies or adverse effects that the user has previously dentified. It provides the user with a summary of the
221
+ product's content and the presence or absence of the substances they have identified. Thanks to this application, we
222
+ came 3rd in BIk Academy and Huawei Coding Marathon
223
+ Multi View Breast Cancer Classification App - Python, PyQs, Deep learming
224
+ Apr 2022
225
+ Within the scope of Teknofest artificial intelligence in health competition, the team I captained by developing a
226
+ multi-model deep learning network for the diagnosis of breast cancers succeeded in becoming a finalist.
227
+ Optical Character Recognition with Streamlit I Python, Streamlit, Huggingface
228
+ Jan 2024
229
+ OCR (Optical Character Recognition) technology has transformed how we interact with textual content in the digital
230
+ realm. By converting images, scanned documents, and other media into editable and searchable text, OCR enables us to
231
+ extract valuable information from diverse sources.
232
+
233
+ --- Page 2 (DocTR) ---
234
+ Technical Skills
235
+ Languages: Python, C/ C++, Matlab, SQL, RobotStudio, Ros, PLC
236
+ Developer Tools: Tensorflow, Pytorch, Google Cloud Platform, Huawei Cloud
237
+ Technologies/Frameworks: Linux, GitHub, Selenium, Docker
238
+ Certificates
239
+ Teknofest Finalist Certificate:13 Foundation
240
+ BTK Academy And Huawei Coding Marathon Certificate of Competitio Winning (3rd)
241
+ EITCA Artificial Intelligence Academy 12 Certificates European Union
242
+ TensorFlow: Advanced Techniques Specialization Coursera
243
+ Google Cloud Expertise Google
244
+ AI Expert Training Program 6 months Republic Of Tirkiye Ministry of Industry and Technology
245
+ Introduction to Machine Learning in Production Coursera
246
+ Hands-on ROS Training with Python :Udemy
247
+ Image processing with deep learning :Udemy
248
+
249
+
250
+ ---------------------------------------------------------------------------
251
+
252
+ ---------------------------------------------------------------------------
253
+ Attributes of Output:
254
+ {"Model Names": ["DocTR"], "Language": "Türkçe", "Device": "CPU", "Process Time": 9.17273736000061}
255
+ OCR Result:
256
+ --- Page 1 (DocTR) ---
257
+ ALPEREN ÇELIK
258
+ - +90 5453851876 peradkisfotgmalcom im.com/m/Aperenell-791aits, 0 llh-com/Alperencls,
259
+ Education
260
+ Afyon Kocatepe University
261
+ Sep. 2018 - Jan 2024
262
+ Bachelor of Mechatronic Engineering
263
+ 3.1 gpa
264
+ Relevant Coursework
265
+ Artificial Intelligence
266
+ Software Methodology
267
+ Database Management
268
+ Internet Technology
269
+ Computer Vision
270
+ Algorithms Analysis
271
+ Data Structures
272
+ Systems Programming
273
+ Experience
274
+ Novelty AI
275
+ Sep 2023 - Present
276
+ AI/ML Engineer
277
+ Gebze, Turkiye
278
+ I developed dentification system software for a bank using deep learning techniques. This system aimed to increase
279
+ security by optimizing customer authentication processes and was successfully implemented.
280
+ At a defense industry company, I managed the installation of industrial robots (ABB) on the production line. In this
281
+ project, I provided complex automation solutions to integrate robotic: systems and increase operational efficiency-
282
+ For one of the leading telecom companies in Turkey, I developed software that enables live broadcasting and OTT
283
+ automation with Suitest software using Python and image processing techniques.
284
+ For a beverage company, I was part of the team that developed an artificial intelligence application that checks the
285
+ recognition and accuracy of product labels on the production line.
286
+ University of Malta
287
+ Jul 2023 - Aug 2023 (3 mos)
288
+ AI Researcher
289
+ Maita
290
+ * I worked on a sem-autonomous drone that tries to detect. waste pet bottles on beaches with artificial intelligence. I used
291
+ Python and C++ languages in this project
292
+ TC Diyanet isleri Bagkanhg
293
+ May 2019 - Jul 2022 (3 yrs 3 mos)
294
+ Civil Servant
295
+ Ayonkarhisar, Turkiye
296
+ While studying Mechatronics Engineering at the university, I worked as a civil servant and eammed a living. During these
297
+ three years, the most essential value that this job added to me was to develop myself discipline and determination in
298
+ order to keep the tough school and work life: in balance.
299
+ DHMI Erzurum Airport
300
+ Jul 2022 - Aug 2022 (2 mos)
301
+ Mechatronics Engineer Intern
302
+ Brzurum, Turkiye
303
+ * I worked as an intern in areas such as sensors, electronic cards, x-ray devices in terminal electronics. My most significant
304
+ gain from this internship was learning corporate work discipline and internal relationship techniques.
305
+ Ecodation
306
+ Jun 2021 - Jul 2021 (2 mos)
307
+ Python Developer Intern
308
+ Istanbul, Turkiye
309
+ I made projects such as navigation and customer tracking system for cargo delivery. My main achievement was learning
310
+ to work as a team.
311
+ Projects
312
+ Personalized Product Analysis with AI Python, Net, Huatei Cloud
313
+ Oct 2023
314
+ Our application is designed to help users make informed and healthy choices when purchasing products. By uploading a
315
+ photo of the product's ingredients, the user can get a detailed analysis of how suitable and beneficial the product is for
316
+ them. The application scans the content of the product with artificial intelligence systems and identifies substances that
317
+ may cause allergies or adverse effects that the user has previously dentified. It provides the user with a summary of the
318
+ product's content and the presence or absence of the substances they have identified. Thanks to this application, we
319
+ came 3rd in BIk Academy and Huawei Coding Marathon
320
+ Multi View Breast Cancer Classification App - Python, PyQs, Deep learming
321
+ Apr 2022
322
+ Within the scope of Teknofest artificial intelligence in health competition, the team I captained by developing a
323
+ multi-model deep learning network for the diagnosis of breast cancers succeeded in becoming a finalist.
324
+ Optical Character Recognition with Streamlit I Python, Streamlit, Huggingface
325
+ Jan 2024
326
+ OCR (Optical Character Recognition) technology has transformed how we interact with textual content in the digital
327
+ realm. By converting images, scanned documents, and other media into editable and searchable text, OCR enables us to
328
+ extract valuable information from diverse sources.
329
+
330
+ --- Page 2 (DocTR) ---
331
+ Technical Skills
332
+ Languages: Python, C/ C++, Matlab, SQL, RobotStudio, Ros, PLC
333
+ Developer Tools: Tensorflow, Pytorch, Google Cloud Platform, Huawei Cloud
334
+ Technologies/Frameworks: Linux, GitHub, Selenium, Docker
335
+ Certificates
336
+ Teknofest Finalist Certificate:13 Foundation
337
+ BTK Academy And Huawei Coding Marathon Certificate of Competitio Winning (3rd)
338
+ EITCA Artificial Intelligence Academy 12 Certificates European Union
339
+ TensorFlow: Advanced Techniques Specialization Coursera
340
+ Google Cloud Expertise Google
341
+ AI Expert Training Program 6 months Republic Of Tirkiye Ministry of Industry and Technology
342
+ Introduction to Machine Learning in Production Coursera
343
+ Hands-on ROS Training with Python :Udemy
344
+ Image processing with deep learning :Udemy
345
+
346
+
347
+ ---------------------------------------------------------------------------
348
+
349
+ ---------------------------------------------------------------------------
350
+ Attributes of Output:
351
+ {"Model Names": ["DocTR"], "Language": "Türkçe", "Device": "CPU", "Process Time": 9.556658029556274}
352
+ OCR Result:
353
+ --- Page 1 (DocTR) ---
354
+ ALPEREN ÇELIK
355
+ - +90 5453851876 peradkisfotgmalcom im.com/m/Aperenell-791aits, 0 llh-com/Alperencls,
356
+ Education
357
+ Afyon Kocatepe University
358
+ Sep. 2018 - Jan 2024
359
+ Bachelor of Mechatronic Engineering
360
+ 3.1 gpa
361
+ Relevant Coursework
362
+ Artificial Intelligence
363
+ Software Methodology
364
+ Database Management
365
+ Internet Technology
366
+ Computer Vision
367
+ Algorithms Analysis
368
+ Data Structures
369
+ Systems Programming
370
+ Experience
371
+ Novelty AI
372
+ Sep 2023 - Present
373
+ AI/ML Engineer
374
+ Gebze, Turkiye
375
+ I developed dentification system software for a bank using deep learning techniques. This system aimed to increase
376
+ security by optimizing customer authentication processes and was successfully implemented.
377
+ At a defense industry company, I managed the installation of industrial robots (ABB) on the production line. In this
378
+ project, I provided complex automation solutions to integrate robotic: systems and increase operational efficiency-
379
+ For one of the leading telecom companies in Turkey, I developed software that enables live broadcasting and OTT
380
+ automation with Suitest software using Python and image processing techniques.
381
+ For a beverage company, I was part of the team that developed an artificial intelligence application that checks the
382
+ recognition and accuracy of product labels on the production line.
383
+ University of Malta
384
+ Jul 2023 - Aug 2023 (3 mos)
385
+ AI Researcher
386
+ Maita
387
+ * I worked on a sem-autonomous drone that tries to detect. waste pet bottles on beaches with artificial intelligence. I used
388
+ Python and C++ languages in this project
389
+ TC Diyanet isleri Bagkanhg
390
+ May 2019 - Jul 2022 (3 yrs 3 mos)
391
+ Civil Servant
392
+ Ayonkarhisar, Turkiye
393
+ While studying Mechatronics Engineering at the university, I worked as a civil servant and eammed a living. During these
394
+ three years, the most essential value that this job added to me was to develop myself discipline and determination in
395
+ order to keep the tough school and work life: in balance.
396
+ DHMI Erzurum Airport
397
+ Jul 2022 - Aug 2022 (2 mos)
398
+ Mechatronics Engineer Intern
399
+ Brzurum, Turkiye
400
+ * I worked as an intern in areas such as sensors, electronic cards, x-ray devices in terminal electronics. My most significant
401
+ gain from this internship was learning corporate work discipline and internal relationship techniques.
402
+ Ecodation
403
+ Jun 2021 - Jul 2021 (2 mos)
404
+ Python Developer Intern
405
+ Istanbul, Turkiye
406
+ I made projects such as navigation and customer tracking system for cargo delivery. My main achievement was learning
407
+ to work as a team.
408
+ Projects
409
+ Personalized Product Analysis with AI Python, Net, Huatei Cloud
410
+ Oct 2023
411
+ Our application is designed to help users make informed and healthy choices when purchasing products. By uploading a
412
+ photo of the product's ingredients, the user can get a detailed analysis of how suitable and beneficial the product is for
413
+ them. The application scans the content of the product with artificial intelligence systems and identifies substances that
414
+ may cause allergies or adverse effects that the user has previously dentified. It provides the user with a summary of the
415
+ product's content and the presence or absence of the substances they have identified. Thanks to this application, we
416
+ came 3rd in BIk Academy and Huawei Coding Marathon
417
+ Multi View Breast Cancer Classification App - Python, PyQs, Deep learming
418
+ Apr 2022
419
+ Within the scope of Teknofest artificial intelligence in health competition, the team I captained by developing a
420
+ multi-model deep learning network for the diagnosis of breast cancers succeeded in becoming a finalist.
421
+ Optical Character Recognition with Streamlit I Python, Streamlit, Huggingface
422
+ Jan 2024
423
+ OCR (Optical Character Recognition) technology has transformed how we interact with textual content in the digital
424
+ realm. By converting images, scanned documents, and other media into editable and searchable text, OCR enables us to
425
+ extract valuable information from diverse sources.
426
+
427
+ --- Page 2 (DocTR) ---
428
+ Technical Skills
429
+ Languages: Python, C/ C++, Matlab, SQL, RobotStudio, Ros, PLC
430
+ Developer Tools: Tensorflow, Pytorch, Google Cloud Platform, Huawei Cloud
431
+ Technologies/Frameworks: Linux, GitHub, Selenium, Docker
432
+ Certificates
433
+ Teknofest Finalist Certificate:13 Foundation
434
+ BTK Academy And Huawei Coding Marathon Certificate of Competitio Winning (3rd)
435
+ EITCA Artificial Intelligence Academy 12 Certificates European Union
436
+ TensorFlow: Advanced Techniques Specialization Coursera
437
+ Google Cloud Expertise Google
438
+ AI Expert Training Program 6 months Republic Of Tirkiye Ministry of Industry and Technology
439
+ Introduction to Machine Learning in Production Coursera
440
+ Hands-on ROS Training with Python :Udemy
441
+ Image processing with deep learning :Udemy
442
+
443
+
444
+ ---------------------------------------------------------------------------
445
+
446
+ ---------------------------------------------------------------------------
447
+ Attributes of Output:
448
+ {"Model Names": ["EasyOCR", "Tesseract", "DocTR"], "Language": "English", "Device": "CPU", "Process Time": 31.933929920196533}
449
+ OCR Result:
450
+ --- Page 1 (DocTR) ---
451
+ ALPEREN ÇELIK
452
+ - +90 5453851876 peradkisfotgmalcom im.com/m/Aperenell-791aits, 0 llh-com/Alperencls,
453
+ Education
454
+ Afyon Kocatepe University
455
+ Sep. 2018 - Jan 2024
456
+ Bachelor of Mechatronic Engineering
457
+ 3.1 gpa
458
+ Relevant Coursework
459
+ Artificial Intelligence
460
+ Software Methodology
461
+ Database Management
462
+ Internet Technology
463
+ Computer Vision
464
+ Algorithms Analysis
465
+ Data Structures
466
+ Systems Programming
467
+ Experience
468
+ Novelty AI
469
+ Sep 2023 - Present
470
+ AI/ML Engineer
471
+ Gebze, Turkiye
472
+ I developed dentification system software for a bank using deep learning techniques. This system aimed to increase
473
+ security by optimizing customer authentication processes and was successfully implemented.
474
+ At a defense industry company, I managed the installation of industrial robots (ABB) on the production line. In this
475
+ project, I provided complex automation solutions to integrate robotic: systems and increase operational efficiency-
476
+ For one of the leading telecom companies in Turkey, I developed software that enables live broadcasting and OTT
477
+ automation with Suitest software using Python and image processing techniques.
478
+ For a beverage company, I was part of the team that developed an artificial intelligence application that checks the
479
+ recognition and accuracy of product labels on the production line.
480
+ University of Malta
481
+ Jul 2023 - Aug 2023 (3 mos)
482
+ AI Researcher
483
+ Maita
484
+ * I worked on a sem-autonomous drone that tries to detect. waste pet bottles on beaches with artificial intelligence. I used
485
+ Python and C++ languages in this project
486
+ TC Diyanet isleri Bagkanhg
487
+ May 2019 - Jul 2022 (3 yrs 3 mos)
488
+ Civil Servant
489
+ Ayonkarhisar, Turkiye
490
+ While studying Mechatronics Engineering at the university, I worked as a civil servant and eammed a living. During these
491
+ three years, the most essential value that this job added to me was to develop myself discipline and determination in
492
+ order to keep the tough school and work life: in balance.
493
+ DHMI Erzurum Airport
494
+ Jul 2022 - Aug 2022 (2 mos)
495
+ Mechatronics Engineer Intern
496
+ Brzurum, Turkiye
497
+ * I worked as an intern in areas such as sensors, electronic cards, x-ray devices in terminal electronics. My most significant
498
+ gain from this internship was learning corporate work discipline and internal relationship techniques.
499
+ Ecodation
500
+ Jun 2021 - Jul 2021 (2 mos)
501
+ Python Developer Intern
502
+ Istanbul, Turkiye
503
+ I made projects such as navigation and customer tracking system for cargo delivery. My main achievement was learning
504
+ to work as a team.
505
+ Projects
506
+ Personalized Product Analysis with AI Python, Net, Huatei Cloud
507
+ Oct 2023
508
+ Our application is designed to help users make informed and healthy choices when purchasing products. By uploading a
509
+ photo of the product's ingredients, the user can get a detailed analysis of how suitable and beneficial the product is for
510
+ them. The application scans the content of the product with artificial intelligence systems and identifies substances that
511
+ may cause allergies or adverse effects that the user has previously dentified. It provides the user with a summary of the
512
+ product's content and the presence or absence of the substances they have identified. Thanks to this application, we
513
+ came 3rd in BIk Academy and Huawei Coding Marathon
514
+ Multi View Breast Cancer Classification App - Python, PyQs, Deep learming
515
+ Apr 2022
516
+ Within the scope of Teknofest artificial intelligence in health competition, the team I captained by developing a
517
+ multi-model deep learning network for the diagnosis of breast cancers succeeded in becoming a finalist.
518
+ Optical Character Recognition with Streamlit I Python, Streamlit, Huggingface
519
+ Jan 2024
520
+ OCR (Optical Character Recognition) technology has transformed how we interact with textual content in the digital
521
+ realm. By converting images, scanned documents, and other media into editable and searchable text, OCR enables us to
522
+ extract valuable information from diverse sources.
523
+
524
+ --- Page 2 (DocTR) ---
525
+ Technical Skills
526
+ Languages: Python, C/ C++, Matlab, SQL, RobotStudio, Ros, PLC
527
+ Developer Tools: Tensorflow, Pytorch, Google Cloud Platform, Huawei Cloud
528
+ Technologies/Frameworks: Linux, GitHub, Selenium, Docker
529
+ Certificates
530
+ Teknofest Finalist Certificate:13 Foundation
531
+ BTK Academy And Huawei Coding Marathon Certificate of Competitio Winning (3rd)
532
+ EITCA Artificial Intelligence Academy 12 Certificates European Union
533
+ TensorFlow: Advanced Techniques Specialization Coursera
534
+ Google Cloud Expertise Google
535
+ AI Expert Training Program 6 months Republic Of Tirkiye Ministry of Industry and Technology
536
+ Introduction to Machine Learning in Production Coursera
537
+ Hands-on ROS Training with Python :Udemy
538
+ Image processing with deep learning :Udemy
539
+
540
+
541
+ ---------------------------------------------------------------------------
542
+
543
+ ---------------------------------------------------------------------------
544
+ Attributes of Output:
545
+ {"Model Names": ["EasyOCR", "DocTR", "Tesseract", "PaddleOCR"], "Language": "English", "Device": "CPU", "Process Time": 22.685651779174805}
546
+ OCR Result:
547
+ --- Page 1 (DocTR) ---
548
+ RUNNING. Stop Deploy :
549
+ Settings
550
+ Select Device
551
+ OCR and LLM Application
552
+ e CPU
553
+ O GPU (CUDA)
554
+ Upload File (PDF, Image)
555
+ V Save Outputs
556
+ Drag and drop filel here
557
+ Browse files
558
+ Limit 200MB pert file-PDF,PNG, JPG, JPEG
559
+ Selectlanguage
560
+ English
561
+ Select OCR Models
562
+ EasyOCR X DOcTR x
563
+ Tesseract x PaddleOcR
564
+ SelectLLMN Model
565
+ llama3.1
566
+ Enter command:
567
+ Selecttaskt type:
568
+ e Summarize
569
+ Generate
570
+
571
+
572
+ ---------------------------------------------------------------------------
results/ocr_output_EasyOCR.txt ADDED
@@ -0,0 +1,405 @@
1
+
2
+ ---------------------------------------------------------------------------
3
+ Attributes of Output:
4
+ {"Model Names": ["EasyOCR"], "Language": "Türkçe", "Device": "CPU", "Process Time": 0.6307122707366943}
5
+ OCR Result:
6
+ --- Page 1 (EasyOCR) ---
7
+ Genel olarak sağ elinlz ne kadar iyl çalıştı?
8
+ Sağ parmaklannız ne kadar iyi hareket etti?
9
+ Sağ bileğiniz ne kadar iyi hareket etti?
10
+ Sağ elinizin kuvveti nasıldı?
11
+ Sağ elinizde
12
+ (his) nasıldı?
13
+ duyu
14
+
15
+
16
+ ---------------------------------------------------------------------------
17
+
18
+ ---------------------------------------------------------------------------
19
+ Attributes of Output:
20
+ {"Model Names": ["DocTR", "EasyOCR"], "Language": "Türkçe", "Device": "CPU", "Process Time": 1.7959399223327637}
21
+ OCR Result:
22
+ --- Page 1 (EasyOCR) ---
23
+ Genel olarak sağ elinlz ne kadar iyl çalıştı?
24
+ Sağ parmaklannız ne kadar iyi hareket etti?
25
+ Sağ bileğiniz ne kadar iyi hareket etti?
26
+ Sağ elinizin kuvveti nasıldı?
27
+ Sağ elinizde
28
+ (his) nasıldı?
29
+ duyu
30
+
31
+
32
+ ---------------------------------------------------------------------------
33
+
34
+ ---------------------------------------------------------------------------
35
+ Attributes of Output:
36
+ {"Model Names": ["DocTR", "EasyOCR"], "Language": "Türkçe", "Device": "CPU", "Process Time": 1.8714756965637207}
37
+ OCR Result:
38
+ --- Page 1 (EasyOCR) ---
39
+ Genel olarak sağ elinlz ne kadar iyl çalıştı?
40
+ Sağ parmaklannız ne kadar iyi hareket etti?
41
+ Sağ bileğiniz ne kadar iyi hareket etti?
42
+ Sağ elinizin kuvveti nasıldı?
43
+ Sağ elinizde
44
+ (his) nasıldı?
45
+ duyu
46
+
47
+
48
+ ---------------------------------------------------------------------------
49
+
50
+ ---------------------------------------------------------------------------
51
+ Attributes of Output:
52
+ {"Model Names": ["EasyOCR", "Tesseract", "DocTR"], "Language": "English", "Device": "CPU", "Process Time": 31.933929920196533}
53
+ OCR Result:
54
+ --- Page 1 (EasyOCR) ---
55
+ ALPEREN CELIK
56
+ +yu ,isssisiu
57
+ AerlS7Gtuicut
58
+ AukELLOlllIW/Aluteu-celk-/414S1G
59
+ Kthu CUt
60
+ Aluerenclk /
61
+ Educanou
62
+ Afyon Kocatepe
63
+ Universitu
64
+ 2U18
65
+ Jan '2U244
66
+ Becd-
67
+ Wce
68
+ EngmetinS
69
+ ueau
70
+ Cuursewors
71
+ Lruncl
72
+ Intelligence
73
+ Sulttitt
74
+ Methextlolugy
75
+ DAlLc
76
+ MatagCMLEut
77
+ JcA
78
+ Tecltolxy
79
+ Cotupute
80
+ Vsiol
81
+ Algurithts
82
+ AMNI:
83
+ Dal Struceu
84
+ XSLett
85
+ PozHAing
86
+ Experience
87
+ Novelty
88
+ Sep 2028
89
+ Preemt
90
+ AI{ML Engiutt'
91
+ Gluzl _
92
+ TuT(Jut
93
+ veuc
94
+ Ict ILLLcACLUM fsem waEe
95
+ USILE deek
96
+ Jeautniug
97
+ teciutles
98
+ s * SLIIL MITL
99
+ cELS
100
+ JCurInI=
101
+ #pLIMAZI= custumer authiettcatioln
102
+ UEuC3s; Au Mi suclsslul;
103
+ JHUeteteu
104
+ l
105
+ MT
106
+ u
107
+ Jalaged tle
108
+ MsI;cM
109
+ Faol
110
+ (ABB
111
+ tle productict Iitje _
112
+ ptueci
113
+ Doudrdouulex #uouabclnauutiuts
114
+ E_HIC rulol I
115
+ ALC MCEURC UPXCLUM]
116
+ eflicietcy-
117
+ Lalue
118
+ CUELAAELC
119
+ uk
120
+ eveloue
121
+ (at QHable=
122
+ L; UEuiLSML A U
123
+ MLUHAOU
124
+ Suiest
125
+ ~utwit USlt= Puthutt
126
+ IEMc DolFML [Lclelacc
127
+ 0Jikl
128
+ UELAALY_
129
+ develope
130
+ ncllIM aa
131
+ applicatin that checks
132
+ MuuM
133
+ Muaa
134
+ mukluet lalels on the
135
+ prouluction Iine .
136
+ Universitv
137
+ Malta
138
+ Jul 2u23
139
+ AuS 2023 (3 mos)
140
+ Rec
141
+ Multu
142
+ WuLe
143
+ FM umR Ca
144
+ K[
145
+ pet hottl:
146
+ a
147
+ Jck #ufc
148
+ inte Illgetee .
149
+ FBLI
150
+ Fatl
151
+ JitgMiyes
152
+ prujec
153
+ TC Diyanet isleri Daskanlgi
154
+ Ni
155
+ ZUlJ
156
+ Jul 2022 (3 yra
157
+ MJOSI
158
+ Servuf (
159
+ Auun UtJisu _
160
+ TuT(Jut
161
+ Wle STI]VL _eclulez-
162
+ Eueeg
163
+ MEOI;
164
+ Wuke
165
+ C ;CY[
166
+ MVILE.
167
+ 113t
168
+ Iiee tS_
169
+ value that tils juls akdledl
170
+ derJur [TAC dSCAEML"
171
+ tleter uat IclL
172
+ ut)
173
+ Illt (ulleli *ol
174
+ wutk Iit
175
+ Dalce
176
+ DHMI
177
+ Erzunun
178
+ Airport
179
+ Jul 2U22
180
+ 2022
181
+ MORI
182
+ Auchutiue:
183
+ Exeeet
184
+ Iuter
185
+ Erzu 4J{A
186
+ TuT(Jut
187
+ uckl
188
+ ueru
189
+ irLi
190
+ ulSUs
191
+ electrule LE
192
+ X-t devices
193
+ teHinal electronic My
194
+ SeuicMHIC
195
+ Ll IFUl
196
+ [LS LcAMP %iatMl ColAlC FUk 4EWLC AEla
197
+ IELTe ICAtlMaIL [CCMLUYR
198
+ codarion
199
+ 2021
200
+ Jul "Z
201
+ Pedlu
202
+ De
203
+ K
204
+ Ivux
205
+ Tutiye
206
+ cl
207
+ MuCd
208
+ TTLAl HL CU:[eE
209
+ TACL >F1CM
210
+ cargo delivery: My Wain achlereJleut #als
211
+ Iemttg
212
+ u
213
+ M
214
+ Frojects
215
+ Peraqualked
216
+ Pruduc
217
+ Analysis wich AI
218
+ Puk
219
+ Net.
220
+ Hnct Clatu
221
+ 203
222
+ #pplication
223
+ ataiuleu
224
+ Qel USeI:
225
+ Wfurtual
226
+ M lt
227
+ M
228
+ Ge
229
+ MECILLAJIY Decis
230
+ By uplonaling #
231
+ pelu
232
+ tle potluct"
233
+ iuguexlicuts,
234
+ Uset GL 2el
235
+ daraily #uales
236
+ how" ?uitable aud Lxuefielal te petluct
237
+ m
238
+ The Applcaclon
239
+ tlie [uuluet with Atificial
240
+ intelllgenc:
241
+ Mdemtln ? aulritce=
242
+ C
243
+ alletzie?
244
+ e em
245
+ Huuea
246
+ Hleucife
247
+ Doudas
248
+ M MaT
249
+ Rui
250
+ product
251
+ T ' UFaa
252
+ WR
253
+ Liamuan
254
+ Tlaul;
255
+ thls applcaclon
256
+ BTk Aeuety"
257
+ Huaw ej Cucllug Marathon
258
+ Mulce
259
+ Vew
260
+ Breast Cancer
261
+ Clasalficatiun
262
+ Pgthon. PyQtss
263
+ Deep Jetc
264
+ Apr 2022
265
+ FtiL Ti' :UM"
266
+ Tektulest artlcl] mtelluence
267
+ lelth coletitull_
268
+ CLiIL
269
+ GuLilLd UY
270
+ develjing
271
+ Juulti-uudel deep leatuiug uetwak
272
+ tlie dmeISI
273
+ brenst Caucets sucurdtL
274
+ MeLLI"
275
+ HLA]st
276
+ Uutical
277
+ Charaeter Recoguition
278
+ Aa
279
+ Stremlit
280
+ Pvn
281
+ Staxurtt.
282
+ Hadwefuc
283
+ 202a
284
+ OCR
285
+ Opeleal Characte
286
+ Recuutil| [CCMUluy Mis Iriledoned Mou
287
+ teruc WIII textual cutett
288
+ tle digital
289
+ Jealu By cutvelting itage>
290
+ TmMtL
291
+ LUCIILICM3
292
+ iLCumer MeU To €taule
293
+ srclile tex_
294
+ OCK e"alile?
295
+ extract Wilu:LH
296
+ MLLUlion
297
+ ht dnvere auuce
298
+ Sep:
299
+ TeettLA
300
+ coyet"
301
+ JA
302
+ Dug
303
+ AS Wl
304
+ 4u
305
+ 4ug
306
+ kh
307
+ 404u
308
+ uu
309
+
310
+ --- Page 2 (EasyOCR) ---
311
+ Technical Skills
312
+ Languages:
313
+ Fytlon €} C++-
314
+ Matlal SQL.
315
+ RukuStudl
316
+ FLC
317
+ Developer
318
+ Tuula:
319
+ Taue #lue
320
+ Pytorch Guogl Clouel Flatfottu
321
+ Huawel Cludl
322
+ Techuologies
323
+ autneworke
324
+ Linux . GitHub. Selen_
325
+ W
326
+ Certilicates
327
+ Tekuuleer
328
+ Fluillat
329
+ Cercileate T; Fouulaticn
330
+ BTK
331
+ ChdT
332
+ eae
333
+ Codlug
334
+ Martlou
335
+ Cetfc
336
+ Cmpetitic Winuing (cd)
337
+ EITCA
338
+ Artaclal
339
+ Intelligeuce
340
+ ACha
341
+ Certncates
342
+ :Euro[an
343
+ Uulu
344
+ TeugorFlor
345
+ Laten
346
+ Teclulane
347
+ Specializatlon
348
+ Cuueari
349
+ Google
350
+ Cloud
351
+ Expertise Guogle
352
+ Expert Traluing Program
353
+ Womne
354
+ Republic OF Tikiye Minisuty
355
+ 'Indlustiy
356
+ Tehology
357
+ Mr u
358
+ Nacke
359
+ Learmg
360
+ Productou
361
+ Cuuer
362
+ Hauds-On ROS Tralniug
363
+ Eitk
364
+ Python :Udletny
365
+ Itage processlug
366
+ wtk dcer
367
+ learuiug : Udlemy
368
+
369
+
370
+ ---------------------------------------------------------------------------
371
+
372
+ ---------------------------------------------------------------------------
373
+ Attributes of Output:
374
+ {"Model Names": ["EasyOCR", "DocTR", "Tesseract", "PaddleOCR"], "Language": "English", "Device": "CPU", "Process Time": 22.685651779174805}
375
+ OCR Result:
376
+ --- Page 1 (EasyOCR) ---
377
+ RUNNING
378
+ Stop
379
+ Deploy
380
+ Settings
381
+ Select Device
382
+ OCR and LLM Application
383
+ CPU
384
+ GPU (CUDA)
385
+ Upload File (PDF; Image)
386
+ Drag and drop file here
387
+ Save Outputs
388
+ Browse files
389
+ Limit ZOOMB per file : PDF; PNG; JPG, JPEG
390
+ Select Language
391
+ English
392
+ Select OCR Models
393
+ EasyOCR
394
+ DocTR
395
+ Tesseract
396
+ PaddleOCR
397
+ Select LLM Model
398
+ Ilama3.1
399
+ Enter command:
400
+ Select task type:
401
+ Summarize
402
+ Generate
403
+
404
+
405
+ ---------------------------------------------------------------------------
results/ocr_output_PaddleOCR.txt ADDED
@@ -0,0 +1,259 @@
1
+
2
+ ---------------------------------------------------------------------------
3
+ Attributes of Output:
4
+ {"Model Names": ["PaddleOCR"], "Language": "English", "Device": "CPU", "Process Time": 13.193224430084229}
5
+ OCR Result:
6
+ --- Page 1 (PaddleOCR) ---
7
+ ALPEREN
8
+ CELIK
9
+ +90 5453851876
10
+ alperenclk18760gmail.cominlinkedin.com/in/alperen-celik-7919a5163/
11
+ github.com/Alperenclk/
12
+ Education
13
+ Afyon Kocatepe University
14
+ Sep. 2018 Jan 2024
15
+ Bachelor of Mechatronic Engineering
16
+ 3.1 gpa
17
+ Relevant Coursework
18
+ Artificial Intelligence
19
+ Software Methodology
20
+ Database Management
21
+ Internet Technology
22
+ Computer Vision
23
+ Algorithms Analysis
24
+ Data Structures
25
+ Systems Programming
26
+ Experience
27
+ Novelty AI
28
+ Sep 2023 Present
29
+ AI/ML Engineer
30
+ Gebze, Turkiye
31
+ I developed identification system software for a bank using deep learning techniques. This system aimed to increase
32
+ security by optimizing customer authentication processes and was successfully implemented.
33
+ At a defense industry company, I managed the installation of industrial robots (ABB) on the production line. In this
34
+ project, I provided complex automation solutions to integrate robotic systems and increase operational efficiency.
35
+ For one of the leading telecom companies in Turkey, I developed software that enables live broadcasting and OTT
36
+ automation with Suitest software using Python and image processing techniques.
37
+ For a beverage company, I was part of the team that developed an artificial intelligence application that checks the
38
+ recognition and accuracy of product labels on the production line.
39
+ University of Malta
40
+ Jul 2023 Aug 2023 (3 mos)
41
+ AI Researcher
42
+ Malt
43
+ I worked on a semi-autonomous drone that tries to detect waste pet bottles on beaches with artificial intelligence. I used
44
+ Python and C++ languages in this project
45
+ TC Diyanet Isleri Baskanlig1
46
+ May 2019 Jul 2022 (3 yrs 3 mos)
47
+ Civil Servant
48
+ Afyonkarhisar, Turkiye
49
+ While studying Mechatronics Engineering at the university, I worked as a civil servant and earned a living. During these
50
+ three years, the most essential value that this job added to me was to develop myself discipline and determination in
51
+ order to keep the tough school and work life in balance.
52
+ DHMI Erzurum Airport
53
+ Jul 2022 Aug 2022 (2 mos)
54
+ Mechatronics Engineer Intern
55
+ Erzurum, Turkiye
56
+ I worked as an intern in areas such as sensors, electronic cards, x-ray devices in terminal electronics. My most significant
57
+ gain from this internship was learning corporate work discipline and internal relationship techniques.
58
+ Ecodation
59
+ Jun 2021 Jul 2021 (2 mos)
60
+ Python Developer Intern
61
+ Istanbul, Turkiye
62
+ I made projects such as navigation and customer tracking system for cargo delivery. My main achievement was learning
63
+ to work as a team.
64
+ Projects
65
+ Personalized Product Analysis with AI | Python, .Net, Huawei Cloud
66
+ Oct 2023
67
+ Our application is designed to help users make informed and healthy choices when purchasing products. By uploading a
68
+ photo of the product's ingredients, the user can get a detailed analysis of how suitable and beneficial the product is for
69
+ them. The application scans the content of the product with artificial intelligence systems and identifies substances that
70
+ may cause allergies or adverse effects that the user has previously identified. It provides the user with a summary of the
71
+ product's content and the presence or absence of the substances they have identified. Thanks to this application, we
72
+ came 3rd in BTk Academy and Huawei Coding Marathon
73
+ Multi View Breast Cancer Classification App | Python, PyQt5, Deep learning
74
+ Apr 2022
75
+ Within the scope of Teknofest artificial intelligence in health competition, the team I captained by developing a
76
+ multi-model deep learning network for the diagnosis of breast cancers succeeded in becoming a finalist.
77
+ Optical Character Recognition with Streamlit | Python, Streamlit, Huggingface
78
+ Jan 2024
79
+ OCR (Optical Character Recognition) technology has transformed how we interact with textual content in the digital
80
+ realm. By converting images, scanned documents, and other media into editable and searchable text, OCR enables us to
81
+ extract valuable information from diverse sources.
82
+
83
+ --- Page 2 (PaddleOCR) ---
84
+ Technical Skills
85
+ Languages: Python, C/ C++, Matlab, SQL, RobotStudio, Ros, PLC
86
+ Developer Tools: Tensorflow, Pytorch, Google Cloud Platform, Huawei Cloud
87
+ Technologies/Frameworks: Linux, GitHub, Selenium, Docker
88
+ Certificates
89
+ Teknofest Finalist Certificate:T3 Foundation
90
+ BTK Academy And Huawei Coding Marathon :Certificate of Competitio Winning (3rd)
91
+ EITCA Artificial Intelligence Academy 12 Certificates :European Union
92
+ TensorFlow: Advanced Techniques Specialization :Coursera
93
+ Google Cloud Expertise :Google
94
+ AI Expert Training Program 6 months :Republic Of Tirkiye Ministry of Industry and Technology
95
+ Introduction to Machine Learning in Production :Coursera
96
+ Hands-on ROS Training with Python :Udemy
97
+ Image processing with deep learning :Udemy
98
+
99
+
100
+ ---------------------------------------------------------------------------
101
+
102
+ ---------------------------------------------------------------------------
103
+ Attributes of Output:
104
+ {"Model Names": ["PaddleOCR"], "Language": "Türkçe", "Device": "CPU", "Process Time": 6.291873216629028}
105
+ OCR Result:
106
+ --- Page 1 (PaddleOCR) ---
107
+ ALPEREN QELIK
108
+ +90 5453851876
109
+ eng.alperengmail.com
110
+ in linkedin.com/in/alperen-celik-7919a5163/
111
+ github.com/Alperenclk/
112
+ Education
113
+ Afyon Kocatepe University
114
+ Sep. 2018 Jan 2024
115
+ Bachelor of Mechatronic Engincering
116
+ 3.1 gpa
117
+ Relevant Coursework
118
+ Artificial Intelligence
119
+ Robotics
120
+ Machine Learning
121
+ Computer Vision
122
+ Cloud Systems
123
+ Deep Learning
124
+ NLP
125
+ Database Management
126
+ Experience
127
+ Novelty AI
128
+ Sep 2023 Present
129
+ AI/ML Engincer
130
+ GebzcTurkiye
131
+ Using advanced Computer Vision and deep learning techniques, I have developed a system that develops billboards that
132
+ provide personalized advertising by analyzing the characteristics of customers who come to stores in shopping malls. This
133
+ system works 15% more successfully than the foreign product previously used.
134
+ Developed C# software using profile sensors from Sick and Venglor to control parts production for a Japan-based factory
135
+ With this software, we used advanced image processing techniques to process and analyze point cloud data to accurately
136
+ monitor the production process and detect defective parts, increasing the success rate of production to 95%.°
137
+ I developed identification system software for a bank using deep learning techniques. This system aimed to increase
138
+ security by optimizing customer authentication processes and was successfully implemented.
139
+ _At a defense industry company, I led the installation of ABB industrial robots on a production line. Leveraging advanced
140
+ Computer Vision techniques, I developed a system that commands robots to perform tasks traditionally performed by
141
+ human operators. This project used Artificial Intelligence and robotics to deliver complex automation solutions, greatly
142
+ improving operational efficiency.
143
+ For one of the leading telecom companies in Turkey, I developed software that enables live broadcasting and OTT
144
+ automation with Suitest software using Python and image processing techniques.
145
+ For a beverage company, I was part of the team that developed an artificial intelligence application that checks the
146
+ recognition and accuracy of product labels on the production line.
147
+ University of Malta
148
+ Jul 2023 Aug 2023 (3 mos)
149
+ Al Rescarcher
150
+ Malta
151
+ I worked with a team developing a semi-autonomous drone to detect waste plastic bottles on beaches using artificial
152
+ intelligence. I implemented computer vision algorithms such as Detectron2 in Python for object detection and used C++
153
+ for real-time integration with the drone's control systems, enabling efficient environmental scanning and cleaning.
154
+ DHMI Erzurum Airport
155
+ Jul 2022 Aug 2022 (2 mos)
156
+ Mechatronics Engincer Intern
157
+ Erzurum, Tirkiye
158
+ I interned in areas such as sensors, electronic cards, and x-ray devices within terminal electronics. The most significant
159
+ gain from this internship was learning corporate work discipline and internal relationship techniques..
160
+ Ecodation
161
+ Jun 2021 Jul 2021 (2 mos)
162
+ Python Developer Intern
163
+ Istanul,Trkiy
164
+ I worked on projects like a navigation and customer tracking system for cargo delivery. These projects involved
165
+ developing and integrating Python APIs and Flask to streamline data communication and using web scraping techniques
166
+ to gather and analyze relevant information from various sources. My main achievement from these projects was learning
167
+ to work effectively as a team, coordinating with colleagues to tackle complex challenges and deliver cohesive solutions.
168
+
169
+ --- Page 2 (PaddleOCR) ---
170
+ Projects
171
+ AI-Driven Cryptocurrency Trading Bot Python, LLM, LangChain, AI Agents
172
+ Developed a cryptocurrency trading bot utilizing Large Language Models (LLMs) and LangChain to enhance trading
173
+ decisions. The bot integrates with the Binance API, executing trades based on real-time market data and technical
174
+ analysis.
175
+ The LangChain framework and AI agents are used to analyze market trends by combining historical data with current
176
+ information. This setup enables the bot to perform detailed technical analysis and generate actionable insights for
177
+ trading.
178
+ AI agents process news and social media to assess market sentiment. The bot adjusts its trading strategies based on
179
+ sentiment and technical indicators, aiming to maximize profitability and minimize risk.
180
+ Python was used for developing the core functionalities and integrating the LangChain system, ensuring the bot's
181
+ efficiency and effectiveness in the volatile cryptocurrency market. The bot also adapts and improves its strategies over
182
+ time through continuous learning.
183
+ Personalized Product Analysis with AI Python, .Net, Huauei Cloud
184
+ Our application is designed to help users make informed and healthy choices when purchasing products. By uploading a
185
+ photo of the product's ingredients, the user can get a detailed analysis of how suitable and beneficial the product is for
186
+ them. The application scans the content of the product with artificial intelligence systems and identifies substances that
187
+ may cause allergies or adverse effects that the user has previously identified. It provides the user with a summary of the
188
+ product's content and the presence or absence of the substances they have identified. Thanks to this application, we
189
+ came 3rd in BTk Academy and Huawei Coding Marathon
190
+ Multi View Breast Cancer Classification App Python, PyQt5, Deep Iearning
191
+ Within the scope of Teknofest artificial intelligence in health competition, the team I captained by developing a
192
+ multi-model deep learning network for the diagnosis of breast cancers succeeded in becoming a finalist.
193
+ Optical Character Recognition with Streamlit Python, Streamlit, Huggingface
194
+ _OCR (Optical Character Recognition) technology has transformed how we interact with textual content in the digital
195
+ realm. By converting images, scanned documents, and other media into editable and searchable text, OCR enables us to
196
+ extract valuable information from diverse sources.
197
+ Technical Skills
198
+ Languages: Python, C/ C++, Matlab, SQL, RobotStudio, Ros, PLC
199
+ Developer Tools: Tensorfiow, Pytorch, Google Cloud Platform, Huawei Cloud
200
+ Technologies/Frameworks: Linux, GitHub, HuggingFace, Kaggle, Selenium, Docker
201
+ Certificates
202
+ Teknofest Finalist Certificate: T3 Foundation
203
+ BTK Academy And Huawei Coding Marathon : Certificate of Competitio Winning (3rd)
204
+ EITCA Artificial Intelligence Academy 12 Certificates : European Union
205
+ TensorFlow: Advanced Techniques Specialization : Coursera
206
+ Google Cloud Expertise : Google
207
+ AI Expert Training Program 6 months_ : Republic Of Tirkiye Ministry of Industry and Technology
208
+ Oxford University English B2 Certification : ClubClass Language School MALTA
209
+ Introduction to Machine Learning in Production : Coursera
210
+ Hands-on ROS Training with Python : Udemy
211
+ Image processing with Deep learning : Udemy
212
+ Honors and Open Source Pojects
213
+ -Teknofest Healthcare Competition Finalist
214
+ -BTK Academy and Huawei Coder Marathon Third Place
215
+ -Kaggle Master https://www.kaggle.com/alperenclk
216
+ Some Articles
217
+ https://medium.com/@alperenclk/exploring-optical-character-recognition-ocr-with-streamlit-and-doctr-
218
+ 00e95ae36e4e
219
+ https://medium.com/@alperenclk/automating-telegram-game-bot-clicker-with-python-step-by-step-guide-
220
+ 1b9206188d06
221
+
222
+
223
+ ---------------------------------------------------------------------------
224
+
225
+ ---------------------------------------------------------------------------
226
+ Attributes of Output:
227
+ {"Model Names": ["EasyOCR", "DocTR", "Tesseract", "PaddleOCR"], "Language": "English", "Device": "CPU", "Process Time": 22.685651779174805}
228
+ OCR Result:
229
+ --- Page 1 (PaddleOCR) ---
230
+ RUNNING...
231
+ Stop
232
+ Deploy
233
+ Settings
234
+ Select Device
235
+ OCR and LLM Application
236
+ O CPU
237
+ Upload File (PDF, Image)
238
+ O GPU (CUDA)
239
+ Save Outputs
240
+ 4
241
+ Drag and drop file here.
242
+ Browse files
243
+ imit 200MB per file - PDF, PNG, JPG, JPEG
244
+ Select Language
245
+ English
246
+ Select OCR Models
247
+ EasyOCR x
248
+ DocTR x
249
+ Tesseract
250
+ PaddleOCR x
251
+ Select LLM Mode!
252
+ llama3.1
253
+ Enter command:
254
+ Select task type:
255
+ O Summarize
256
+ O Generate
257
+
258
+
259
+ ---------------------------------------------------------------------------
results/ocr_output_Tesseract.txt ADDED
@@ -0,0 +1,146 @@
1
+
2
+ ---------------------------------------------------------------------------
3
+ Attributes of Output:
4
+ {"Model Names": ["EasyOCR", "Tesseract", "DocTR"], "Language": "English", "Device": "CPU", "Process Time": 31.933929920196533}
5
+ OCR Result:
6
+ --- Page 1 (Tesseract) ---
7
+ ALPEREN CELIK
8
+
9
+ 7-400 5453861875 WB alperoneSTONguil.com inked com/in/alperen-ell-TO95163/€ github, com/ Alperenelk/
10
+
11
+ Education
12
+ ‘Afyon Kocatepe University Sep. 2018 — Jan 2024
13
+ Bachelor of Mechatronic Engineering 3 gpa
14
+
15
+ Relevant Coursework
16
+
17
+ + Artifial nteligence + Software Methodology + Database Management» Internet Technology
18
+ 2 Cimputer Vision 1 Algartinas Analyste 1 bata Structuses 1 Syatens Progeamtning
19
+
20
+ Experience
21
+
22
+ Novelty AI Sep 2028 — Present
23
+
24
+ AU/ML Engineer Gebse, Tinaye
25
+
26
+ + Tdeveloped ientication system software fora bank using deep learning techniques. This system aimed to inrease
27
+ sccuit by optimizing eustomer authentication proceses and was sucessfully implemented
28
+
29
+ + Ata defense industry company, I managed the installation of industrial robots (ABB) om the production tine, In this
30
+ projet, Tprovided complex automation solutions to integrate robotic systems and increase operational eBclene.
31
+
32
+ + For one of the leading telecom companies in Turkey, I developed software that enable ive broadcasting and OTT
33
+ automation with Sutest software using Python and image procesing techniques
34
+
35
+ + For a beverage company, Iwas port of the team that developed an artical inteligence application that checks the
36
+ recognition and accuracy of product labels on the production ine
37
+
38
+ University of Malta Jul 2023 ~ Aug 2028 (3 mos)
39
+
40
+ AI Researcher Matte
41
+
42
+ + T worked on @ semi-autonomous drone that Wis to detect waste pet bottles on beads with artical intelligence Y used
43
+ Python and C+ languages inthis project
44
+
45
+ May 2019 — Jul 2022 (8 yes mos)
46
+ ini Servant Afyonkarhisar, Mirksye
47
+ + While studying Mechatvoles Engineering atthe university, I worked as a ull servant and earned a living. During these
48
+
49
+ ‘htee years, the most essential value that this job added to me was to develop tnyelfdlacplne and detrtnation in
50
+ fonder to keep the tough schol and work fe tn balance
51
+
52
+ DHMI Erzurum Airport Jul 2022 ~ Aug 2022 (2 mos)
53
+
54
+ Mechatwonses Engineer Intern Brsurumn, Tirkiye
55
+
56
+ + T worked an inter in azeas such as sensors electronic cards, s-zay deviees in terminal eeetronies. My most significant
57
+ sain from this interasip was leaening corporate work dicpline and internal yelationship techniques
58
+
59
+ Ecodation Jun 2021 ~ Jul 2021 (2 mos)
60
+ Python Developer Intern Ibtantad, Taye
61
+
62
+ + Tmade projects uch as navigation and customer tracking system fr cargo delivery. My main achievement was learning
63
+ to wotk asa teat,
64
+
65
+ Projects
66
+
67
+ Personalized Product Analysis with AL| Python, Net, Huawei Clout (Oct 2028
68
+ + Our appleation x designed to help users make informed and healthy choloss when purchasing products, By uploading a
69
+ photo ofthe product's ingredients the user can get detaled analysis of how stable and beneficial the product is for
70
+ ‘hem, The application scans the content of the produet with aetiial intelligence systems and identifies substances that
71
+ say cause allergis or adverse effects that the user has previously identified. Te provides the user with a summary’ of the
72
+ ‘Product's coutent and the presence or absence ofthe substances they have identified, Thanks to this application, we
73
+ fame Sed in BT Academy and Huawel Coding Marathon
74
+
75
+ ‘Multi View Breast Cancer Classification App | Python, PyQU, Deep loaning Ape 2022
76
+ + Within the scope of Teknofest artical intligence in heath competition, the team T captained ly developing a
77
+ sultimodel dep learning network forthe diagnosis of breast cancers succeded in becoming Bnalst,
78
+
79
+ Optical Character Recognition with Streamlt | Python, Steamtt, Hagningface Jan 2024
80
+ + OCR (Optical Character Recogution) technology has transformed how we interact with textual content ia the digital
81
+ realm By converting images, seanned documents, and other media into editable and searchable txt, OCR enables Us to
82
+ ‘extract valuable infrination from diverse sourecs.
83
+
84
+
85
+ --- Page 2 (Tesseract) ---
86
+ ‘Technical Skills
87
+
88
+ Languages: Python, C/ C++, Matlab, SQL, RobotStudlo, Ros, PLE
89
+ Developer Tools: Tensoiow. Pytore Google Cloud Platfnn, Huawel Cloud
90
+ ‘Technologies Frameworks: Linus, GitHub, Selenium, Docker
91
+
92
+ Certificates
93
+
94
+ ‘Telofest Finalist CertfieatecT3 Foundation
95
+ BTK Academy And Huawel Coding Marathon -Certiieate of Compatitio Wining (Sed)
96
+ EITCA Artificial Intelligence Academy 12 Certificates European Union
97
+
98
+ ‘TensorFlow: Advanced Techniques Speciallaation :Coursera
99
+
100
+ Google Cloud Expertise :G
101
+
102
+ AT Expert Training Progr dhs. -Republic OF Tiskiye Ministry of Industry and Technology
103
+ Introduction to Machine Learning in Production :Coutsers
104
+
105
+ Hands-on ROS Training with Python idem
106
+
107
+ Image processing with deep learning -Udeny
108
+
109
+
110
+
111
+
112
+ ---------------------------------------------------------------------------
113
+
114
+ ---------------------------------------------------------------------------
115
+ Attributes of Output:
116
+ {"Model Names": ["EasyOCR", "DocTR", "Tesseract", "PaddleOCR"], "Language": "English", "Device": "CPU", "Process Time": 22.685651779174805}
117
+ OCR Result:
118
+ --- Page 1 (Tesseract) ---
119
+ < RUNNING... Stop Deploy
120
+
121
+ Settings
122
+
123
+ covert OCR and LLM Application
124
+
125
+ @ cpu
126
+ GPU (CUDA)
127
+
128
+ @ Save Outputs
129
+
130
+ Select Language
131
+ English v
132
+ Select OCR Models
133
+
134
+ Eee Ene
135
+ (Tesseract x) [Paduleoce x) © ~
136
+
137
+ ‘Select LLM Model
138
+ Enter command:
139
+ Select task type:
140
+
141
+ ® Summarize
142
+ Generate
143
+
144
+
145
+
146
+ ---------------------------------------------------------------------------
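Review note: the result files above hold the same résumé and screenshot as read by different engines, and the outputs clearly diverge. A rough way to put a number on that divergence is a character-level similarity ratio. The sketch below uses only the standard library; the helper names `normalize` and `agreement` are hypothetical and not part of this commit. With no ground-truth transcript it measures agreement between two engines, not accuracy.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lower-case and collapse whitespace so layout differences are ignored."""
    return " ".join(text.lower().split())


def agreement(text_a: str, text_b: str) -> float:
    """Return a similarity ratio in [0, 1] between two OCR transcripts."""
    # autojunk=False avoids difflib's popularity heuristic on long inputs.
    matcher = SequenceMatcher(None, normalize(text_a), normalize(text_b), autojunk=False)
    return matcher.ratio()


if __name__ == "__main__":
    paddle_line = "Bachelor of Mechatronic Engineering"
    other_line = "Bachelor of Mechatronic Engincering"
    print(f"agreement: {agreement(paddle_line, other_line):.3f}")
```

Comparing page by page rather than whole files keeps the matcher fast and makes it easier to see which page an engine struggles with.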
sample_files/Screenshot 2024-07-13 163331.png ADDED

Git LFS Details

  • SHA256: 1229d8b03062391af846cc9e9af9bab04502665bf264cb9a595992d3617e0316
  • Pointer size: 131 Bytes
  • Size of remote file: 119 kB
sample_files/alperen_celik_14_08.pdf ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:700730cfc0595417324f787e2ad30b96763e0b551c95717e875d8e2191b6ebe9
3
+ size 131864
sample_files/medium_article_image.jpg ADDED

Git LFS Details

  • SHA256: 4929c9218153c970b91ecee59ea191dc4ef22453fb2add8ee5ed59f56c459140
  • Pointer size: 131 Bytes
  • Size of remote file: 264 kB
sample_files/sample_screen.png ADDED
streamlit_app.py ADDED
@@ -0,0 +1,201 @@
+ """Alias entrypoint for Streamlit on Hugging Face Spaces.
+ This is a copy of app.py to match Spaces' default file naming.
+ """
+
+ import streamlit as st
+ from PIL import Image
+ import fitz  # PyMuPDF
+ import numpy as np
+ import tempfile
+ import os
+ import time
+ import io
+ import json
+ import torch
+ import cv2
+
+ # Import OCR engines
+ import ocr_engines
+
+ # Try importing LLM processor if LLM features are to be used
+ llm_available = False
+ try:
+     import llm_processor
+
+     llm_available = True
+ except ImportError:
+     pass  # LLM features will be disabled
+
+ # Create results folder if it doesn't exist
+ if not os.path.exists("results"):
+     os.makedirs("results")
+
+ # Streamlit application
+ st.title("OCRInsight")
+
+ # Sidebar
+ st.sidebar.header("Settings")
+
+
+ # Function to save text to file
+ def save_text_to_file(attributes_of_output, all_ocr_text, filename):
+     with open(filename, "a", encoding="utf-8") as f:
+         f.write("\n" + "-" * 75 + "\n")
+         f.write("Attributes of Output:\n")
+         f.write(attributes_of_output)
+         f.write("\nOCR Result:\n")
+         f.write(all_ocr_text)
+         f.write("\n" + "-" * 75 + "\n")
+     st.success(f"{filename} saved successfully!")
+
+
+ # Device selection
+ device = st.sidebar.radio("Select Device", ["CPU", "GPU (CUDA)"])
+ save_output = st.sidebar.checkbox("Save Outputs")
+
+ # Language selection
+ language = st.sidebar.selectbox(
+     "Select Language", ["Türkçe", "English", "Français", "Deutsch", "Español"]
+ )
+
+ # Map selected language to language codes
+ language_codes = {
+     "Türkçe": "tr",
+     "English": "en",
+     "Français": "fr",
+     "Deutsch": "de",
+     "Español": "es",
+ }
+
+ # OCR model selection
+ ocr_models = st.sidebar.multiselect(
+     "Select OCR Models",
+     ["EasyOCR", "DocTR", "Tesseract", "PaddleOCR"],
+     ["EasyOCR"],  # default selection
+ )
+
+ # LLM model selection
+ llm_model = st.sidebar.selectbox(
+     "Select LLM Model", ["Only OCR Mode", "llama3.1", "llama3", "gemma2"]
+ )
+
+ # Conditional UI elements based on LLM model selection
+ if llm_model != "Only OCR Mode" and llm_available:
+     user_command = st.sidebar.text_input("Enter command:", "")
+
+     task_type = st.sidebar.radio("Select task type:", ["Summarize", "Generate"])
+ elif llm_model != "Only OCR Mode" and not llm_available:
+     st.sidebar.warning(
+         "LLM features are not available. Please install 'ollama' to enable LLM processing."
+     )
+     llm_model = "Only OCR Mode"
+
+ # Check GPU availability
+ if device == "GPU (CUDA)" and not torch.cuda.is_available():
+     st.sidebar.warning("GPU (CUDA) not available. Switching to CPU.")
+     device = "CPU"
+
+ # Initialize OCR models
+ ocr_readers = ocr_engines.initialize_ocr_models(
+     ocr_models, language_codes[language], device
+ )
+
+ # File upload
+ uploaded_file = st.file_uploader(
+     "Upload File (PDF, Image)", type=["pdf", "png", "jpg", "jpeg"]
+ )
+
+ # Create results folder if it doesn't exist
+ if not os.path.exists("results"):
+     os.makedirs("results")
+
+ if uploaded_file is not None:
+     start_time = time.time()
+
+     if uploaded_file.type == "application/pdf":
+         pdf_document = fitz.open(stream=uploaded_file.read(), filetype="pdf")
+         images = []
+         for page_num in range(len(pdf_document)):
+             page = pdf_document.load_page(page_num)
+             pix = page.get_pixmap()
+             img_data = pix.tobytes("png")
+             img = Image.open(io.BytesIO(img_data))
+             images.append(img)
+         total_pages = len(pdf_document)
+         pdf_document.close()
+     else:
+         images = [Image.open(uploaded_file)]
+         total_pages = 1
+
+     all_ocr_texts = {
+         model_name: "" for model_name in ocr_models
+     }  # To store OCR text for each model
+
+     for page_num, image in enumerate(images, start=1):
+         st.image(image, caption=f"Page {page_num}/{total_pages}", use_column_width=True)
+
+         # Perform OCR with each selected model
+         for model_name in ocr_models:
+             text = ocr_engines.perform_ocr(
+                 model_name, ocr_readers, image, language_codes[language]
+             )
+             all_ocr_texts[
+                 model_name
+             ] += f"--- Page {page_num} ({model_name}) ---\n{text}\n\n"
+
+             st.subheader(f"OCR Result ({model_name}) - Page {page_num}/{total_pages}:")
+             st.text(text)
+
+     end_time = time.time()
+     process_time = end_time - start_time
+
+     st.info(f"Processing time: {process_time:.2f} seconds")
+
+     # Save OCR outputs if selected
+     if save_output:
+         attributes_of_output = {
+             "Model Names": ocr_models,
+             "Language": language,
+             "Device": device,
+             "Process Time": process_time,
+         }
+         for model_name, ocr_text in all_ocr_texts.items():
+             filename = f"results//ocr_output_{model_name}.txt"
+             save_text_to_file(
+                 json.dumps(attributes_of_output, ensure_ascii=False), ocr_text, filename
+             )
+
+     # LLM processing
+     if (
+         llm_model != "Only OCR Mode"
+         and llm_available
+         and st.sidebar.button("Start LLM Processing")
+     ):
+         st.subheader("LLM Processing Result:")
+
+         # Combine all OCR texts
+         combined_ocr_text = "\n".join(all_ocr_texts.values())
+
+         # Prepare the prompt based on the task type
+         if task_type == "Summarize":
+             prompt = f"Please summarize the following text. Command: {user_command}\n\nText: {combined_ocr_text}"
+         else:  # "Generate"
+             prompt = f"Please generate new text based on the following text. Command: {user_command}\n\nText: {combined_ocr_text}"
+
+         llm_output = llm_processor.process_with_llm(llm_model, prompt)
+
+         # Display the result
+         st.write(f"Processing completed using '{llm_model}' model.")
+         st.text_area("LLM Output:", value=llm_output, height=300)
+
+         # Save LLM output if selected
+         if save_output:
+             filename = "llm_output.txt"
+             save_text_to_file(llm_output, "", filename)
+
+     elif llm_model != "Only OCR Mode" and not llm_available:
+         st.warning(
+             "LLM features are not available. Please install 'ollama' to enable LLM processing."
+         )
+
+ st.sidebar.info(f"Selected device: {device}")
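Review note: `save_text_to_file` appends each run to `results/ocr_output_<model>.txt` as a record framed by two 75-dash rules, with a JSON attributes line followed by the OCR text. A minimal sketch of reading those records back could look like the following. The function name `parse_ocr_records` is hypothetical and not part of this commit, and the sketch assumes the OCR text itself never contains a 75-dash run or the literal `OCR Result:` header. It does not apply to `llm_output.txt`, where the attributes slot holds free text rather than JSON.

```python
import json

SEPARATOR = "-" * 75  # same rule width that save_text_to_file writes


def parse_ocr_records(raw: str) -> list[dict]:
    """Split a results file into records of {'attributes': dict, 'text': str}."""
    records = []
    for block in raw.split(SEPARATOR):
        # Gaps between consecutive rules contain only newlines; skip them.
        if "Attributes of Output:\n" not in block or "\nOCR Result:\n" not in block:
            continue
        header, _, text = block.partition("\nOCR Result:\n")
        attributes_json = header.split("Attributes of Output:\n", 1)[1].strip()
        records.append(
            {"attributes": json.loads(attributes_json), "text": text.strip()}
        )
    return records


if __name__ == "__main__":
    rule = "\n" + SEPARATOR + "\n"
    demo = (
        rule
        + "Attributes of Output:\n"
        + json.dumps({"Model Names": ["PaddleOCR"], "Device": "CPU"})
        + "\nOCR Result:\n"
        + "--- Page 1 (PaddleOCR) ---\nALPEREN\nCELIK\n\n"
        + rule
    )
    for record in parse_ocr_records(demo):
        print(record["attributes"]["Device"], len(record["text"]))
```

Because the app opens the file in append mode, one file can hold runs with different languages and model selections; keeping the attributes with each record makes it possible to filter them afterwards.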