#Keynote #DataScience #NLP #Python @epythonlab
Natural Language Processing
Natural language processing (NLP) is the field devoted to methods and algorithms for processing human (natural) languages for computers. NLP is a vast discipline that is actively being researched. Some examples of machine learning applications using NLP include sentiment analysis, topic modeling, and language translation. In NLP, the following terms have specific meanings:
- Corpus: The body/collection of text being investigated.
- Document: The unit of analysis, what is considered a single observation.
Examples of corpora include a collection of reviews or tweets, the text of the Iliad, and Wikipedia articles. A document can be whatever you decide; it is what your model will consider a single observation. When the corpus is a collection of reviews or tweets, it is logical to make each review or tweet a document. For the text of the Iliad, we can set the document size to a sentence or a paragraph. The choice of document size is influenced by the size of the corpus: if it is large, it may make sense to treat each paragraph as a document. As is usually the case, there are design choices that need to be made.
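As a minimal sketch of these two terms, a corpus can be represented as a Python list in which each element is one document (the review texts below are made up for illustration):

```python
# A corpus is the full collection of text under investigation;
# each document is one observation the model will see.
# Hypothetical example: a corpus of product reviews, one review per document.
corpus = [
    "Great phone, battery lasts all day.",
    "Screen cracked after a week. Disappointed.",
    "Average performance for the price.",
]

# Each list element is a single document / observation.
for i, document in enumerate(corpus):
    print(f"Document {i}: {len(document.split())} tokens")
```

If the corpus were instead one long text like the Iliad, the same list structure applies; you would just split the text into sentences or paragraphs first and make each one an element.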
💻 Linear Algebra for Natural Language Processing
https://www.kdnuggets.com/2021/08/linear-algebra-natural-language-processing.html
Code: https://github.com/Taaniya/linear-algebra-for-ml
@epythonlab #nlp #code #article
Implementing Bag of Words in Python
https://www.youtube.com/watch?v=3aZGZmx4HgU&list=PL0nX4ZoMtjYEIAxICDt2IOjdjGfFYDtYp
Machine Learning Project : Implementing Bag of Words | Converting Words to Lowercase
In this tutorial, you will learn how to implement a bag of words without using sklearn. In this video, you will learn how to convert the document set to lowercase.
#python #machinelearning…
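The approach from the video can be sketched in plain Python, without sklearn. This is a simplified version that assumes whitespace tokenization and no punctuation handling; the function name is illustrative:

```python
from collections import Counter

def bag_of_words(documents):
    """Build a bag-of-words representation without sklearn.

    Each document is lowercased first (the step covered in the video),
    then split on whitespace.
    """
    # Convert the document set to lowercase.
    lowered = [doc.lower() for doc in documents]
    # Build a sorted vocabulary from every token in the corpus.
    vocabulary = sorted({word for doc in lowered for word in doc.split()})
    # One count vector per document, aligned with the vocabulary.
    vectors = []
    for doc in lowered:
        counts = Counter(doc.split())
        vectors.append([counts[word] for word in vocabulary])
    return vocabulary, vectors

vocab, vecs = bag_of_words(["NLP is fun", "Fun with NLP"])
# vocab == ['fun', 'is', 'nlp', 'with']
# vecs  == [[1, 1, 1, 0], [1, 0, 1, 1]]
```

Lowercasing first is what makes "Fun" and "fun" count as the same vocabulary entry.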
📚 5 Free Books on Natural Language Processing to Read in 2023
1. Speech and Language Processing
Authors: Dan Jurafsky and James H. Martin
2. Foundations of Statistical Natural Language Processing
Authors: Christopher D. Manning and Hinrich Schütze
3. Pattern Recognition and Machine Learning
Author: Christopher M. Bishop
4. Neural Network Methods in Natural Language Processing
Author: Yoav Goldberg
5. Practical Natural Language Processing
https://epythonlab.t.me/
#books #python #machinelearning #nlp
📚 5 Free Books on Natural Language Processing to Read in 2024
1. Speech and Language Processing
Authors: Dan Jurafsky and James H. Martin
2. Foundations of Statistical Natural Language Processing
Authors: Christopher D. Manning and Hinrich Schütze
3. Pattern Recognition and Machine Learning
Author: Christopher M. Bishop
4. Neural Network Methods in Natural Language Processing
Author: Yoav Goldberg
5. Practical Natural Language Processing
https://epythonlab.t.me/
#books #python #machinelearning #nlp
📢 Day 16/100: Tackling Amharic NLP Challenges
Amharic presents unique challenges in natural language processing (NLP), from its complex script to a lack of annotated datasets.
My approach: Fine-tune Large Language Models (LLMs) for Amharic Named Entity Recognition (NER) to extract product names, prices, and locations from Telegram messages.
💡 Discussion: What strategies can we adopt to make NLP more accessible for low-resource languages like Amharic?
#NLP #AI #Amharic #FintechEthiopia
📢 Day 19/100: Choosing the Right Language Model
For Amharic Named Entity Recognition, we fine-tuned three models:
1️⃣ XLM-Roberta: Best for multilingual NLP.
2️⃣ mBERT: Balanced performance.
3️⃣ DistilBERT: Lightweight but slightly less accurate.
💡 Insight: XLM-Roberta outperformed the others in accuracy and entity recognition for Amharic e-commerce data.
💡 Question: What's your experience with fine-tuning NLP models for underrepresented languages?
#AI #NLP #ModelSelection #FintechAfrica
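The post does not show code, but a common preprocessing step when fine-tuning subword models such as XLM-Roberta, mBERT, or DistilBERT for NER is aligning word-level entity labels with the subword tokens the tokenizer produces. A minimal sketch (the function name and label ids are illustrative, not from the post):

```python
def align_labels(word_ids, word_labels, ignore_index=-100):
    """Align word-level NER labels to subword tokens.

    word_ids: per-token word index, as produced by a subword tokenizer
              (None for special tokens such as [CLS] / [SEP]).
    word_labels: one label id per original word.
    Only the first subword of each word keeps the label; continuation
    subwords and special tokens get ignore_index so the loss skips them.
    """
    aligned = []
    previous = None
    for word_id in word_ids:
        if word_id is None:
            aligned.append(ignore_index)          # special token
        elif word_id != previous:
            aligned.append(word_labels[word_id])  # first subword of a word
        else:
            aligned.append(ignore_index)          # continuation subword
        previous = word_id
    return aligned

# Hypothetical split: word 0 ("አዲስ") into 3 subwords, word 1 ("አበባ") into 1,
# with special tokens at each end.
labels = align_labels([None, 0, 0, 0, 1, None], [1, 2])
# labels == [-100, 1, -100, -100, 2, -100]
```

This matters especially for Amharic, where the script tends to fragment into many subwords under multilingual tokenizers.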
📢 Day 21/100: Training Amharic NER Models
I fine-tuned models on 27,989 labeled examples, optimizing key parameters:
- Learning rate: Experimented to find the sweet spot.
- Batch size: Limited to 16 to manage memory constraints.
- Metrics: Focused on precision, recall, and F1-score.
💡 Finding: Smaller batches helped balance performance and computational efficiency.
💡 Question: How do you optimize parameters for low-resource NLP tasks?
#AI #ModelTraining #Ethiopia #NLP
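A simplified, token-level sketch of the precision/recall/F1 computation mentioned above (the entity label names are illustrative; in practice span-level scoring with a library such as seqeval is common for NER):

```python
def ner_metrics(true_labels, pred_labels, ignore_label="O"):
    """Micro-averaged precision, recall, and F1 over entity tokens.

    ignore_label marks non-entity tokens, which act as the negative class.
    """
    tp = fp = fn = 0
    for t, p in zip(true_labels, pred_labels):
        if t == ignore_label and p == ignore_label:
            continue                 # true negative: not counted
        if t == p:
            tp += 1                  # entity token labeled correctly
        else:
            if p != ignore_label:
                fp += 1              # predicted an entity that isn't there
            if t != ignore_label:
                fn += 1              # missed (or mislabeled) a true entity
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

p, r, f = ner_metrics(
    ["B-PRODUCT", "O", "B-PRICE", "O"],
    ["B-PRODUCT", "O", "O", "B-PRICE"],
)
# p, r, f == 0.5, 0.5, 0.5
```

Tracking all three together matters here because with mostly "O" tokens, plain accuracy would look deceptively high.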