Epython Lab
Welcome to Epython Lab, where you can get resources to learn, one-on-one trainings on machine learning, business analytics, and Python, and solutions for business problems.

What Is a CSV File?

Text files aren’t the only thing that Python can read, but they’re the only thing that we don’t need any additional parsing library to understand. CSV files are an example of a text file that imposes a structure on its data. CSV stands for Comma-Separated Values, and CSV files are usually the way that data from spreadsheet software (like Microsoft Excel or Google Sheets) is exported into a portable format. A spreadsheet that looks like the following:
Name            Username  Email
Asibeh Tenager  asibeh    asibeh@yahoo.com
Asibeh Tenager  asibeh    asibeh@yahoo.com




In a CSV file that same exact data would be rendered like this:

users.csv

Name,Username,Email
Asibeh Tenager,asibeh,asibeh@yahoo.com
Asibeh Tenager,asibeh,asibeh@yahoo.com

Notice that the first row of the CSV file doesn’t actually represent any data, just the labels of the data that’s present in the rest of the file. The rest of the rows of the file are the same as the rows in the spreadsheet software, just instead of being separated into different cells they’re separated by… well I suppose it’s fair to say they’re separated by commas.
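
As a minimal sketch of how such a file can be read, the example below uses Python's built-in csv module. The file contents are held in an in-memory string (an assumption made only to keep the example self-contained); with a real file you would pass the object returned by open("users.csv", newline="") instead.

```python
import csv
import io

# Stand-in for the contents of users.csv, kept in memory so the sketch runs anywhere.
csv_text = (
    "Name,Username,Email\n"
    "Asibeh Tenager,asibeh,asibeh@yahoo.com\n"
    "Asibeh Tenager,asibeh,asibeh@yahoo.com\n"
)

# DictReader treats the first row as the column labels,
# so every following row becomes a dict keyed by those labels.
reader = csv.DictReader(io.StringIO(csv_text))
rows = list(reader)

for row in rows:
    print(row["Name"], row["Username"], row["Email"])
```

Because the header row supplies the keys, the code can refer to columns by label rather than by position.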


#FaceMask #KeepDistancing #LearnPython #LearnDatascience
Learn Python for

Data Science
Artificial Intelligence
Machine Learning

Mode of Delivery: Short notes, Books, Articles, Exercises, Challenges etc.

🪐 Plan to build a big Dataset in Ethiopia which helps Developers and Researchers abroad.
Mastering_Large_Datasets_with_Python_Parallelize_and_Distribute.pdf
17.4 MB
Mastering Large Datasets with Python: Parallelize and Distribute Your Python Code (2020)

@python4fds
Forwarded from Deleted Account
challenge1.py
1.1 KB
The pie chart shows that 4.5% of the total confirmed cases have died.
#EssayQuestion1

Why do you want to learn Python?

Post your answer @pyDiscussion
Forwarded from Epython Lab (Asibeh Tenager)
Jupyter Notebooks

Jupyter Notebooks are an extremely powerful tool for data analysis because they allow you to run Python commands and see the outputs within the structure of a notebook. This is helpful because in data analysis you are often running short commands to produce the data and visualizations you need for a particular investigation.

To learn how to install Jupyter Notebook, watch the following short video:

https://www.youtube.com/watch?v=5Yx6h7Mgiv0
Do you know the difference between Data Science and Machine Learning?

What are the main differences and similarities between data scientists and machine learning engineers?

Can anyone explain? @pythonEthbot
Data Science Vs Machine Learning


Introduction

It seems that even companies, in their job descriptions, are confused about what constitutes a data scientist versus a machine learning engineer.

Data science is the researching, building, and interpretation of a model, while machine learning is the putting of that model into production.
Data Scientist

A statistician? Kind of. Data science, in its simplest terms, can be described as a field of automated statistics in the form of models that aid in classifying and predicting outcomes. Here are the top skills that are required to be a data scientist:

Python or R

SQL

Jupyter Notebook


Python: To expound on the skills above, most companies are looking for Python more than R. Some job descriptions list both; however, most of the people you work with, like machine learning engineers, data engineers, and software engineers, will not be familiar with R. Therefore, to be a more well-rounded data scientist, Python will be more beneficial for you.

SQL, at first, can seem more like a data analyst skill. It is, but it should still be a skill you employ for data science. In a business setting (as opposed to academia), most datasets are not handed to you, and you will have to build your own via SQL. There are plenty of SQL dialects, such as PostgreSQL, MySQL, Microsoft SQL Server's T-SQL, and Oracle SQL. They are similar forms of the same query language, implemented by different database systems. Because they are so similar, knowing any one of them is useful, and the skill transfers easily to a slightly different dialect.
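
To illustrate building your own dataset via SQL, here is a minimal sketch using Python's built-in sqlite3 module. The orders table, its columns, and its values are hypothetical, and SQLite stands in for whichever database your company actually uses; the query itself would look much the same in the other dialects.

```python
import sqlite3

# In-memory SQLite database standing in for a company data warehouse (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("a", 10.0), ("a", 5.0), ("b", 7.5)],
)

# Build a small modeling dataset: total spend per customer.
dataset = conn.execute(
    "SELECT customer, SUM(amount) AS total_spend "
    "FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
conn.close()

print(dataset)
```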

Jupyter Notebook could almost be the exact opposite of a machine learning engineer’s toolkit. A Jupyter Notebook is a data scientist’s playground for both coding and modeling: a research environment, if you will, allowing quick and easy Python coding that combines comments, the code itself, and a platform to build and test models with useful libraries like sklearn, pandas, and numpy.

Overall, a data scientist can be many things, but the main functions are to

meet with stakeholders to define the business problem

— pull data (SQL)

— EDA, feature engineering, model building, & prediction (Python and Jupyter Notebook)

— depending on workplace, compile code to .py format and/or pickled model
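
The last step above, saving a model as a pickle file, can be sketched with Python's built-in pickle module. The "model" here is a hypothetical stand-in (a plain dict of line parameters with a helper predict function), not a real trained estimator; a fitted model object from a library can usually be saved the same way.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for a trained model: parameters of y = slope * x + intercept.
model = {"slope": 2.0, "intercept": 1.0}

def predict(params, x):
    """Apply the stored line parameters to a single input value."""
    return params["slope"] * x + params["intercept"]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model.pkl")
    # Serialize the model to a .pkl file ...
    with open(path, "wb") as f:
        pickle.dump(model, f)
    # ... and load it back, as a production job would do later.
    with open(path, "rb") as f:
        restored = pickle.load(f)

prediction = predict(restored, 3.0)
```

Only unpickle files you trust, since loading a pickle can execute arbitrary code.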

... To be continued next time ... You can send your comments or feedback to @pythonethbot
11 Deep Learning With Python Libraries and Frameworks

Asked by one of the members

1⃣ TensorFlow Python. TensorFlow is an open-source library for numerical computation that uses data flow graphs. ...

2⃣ Keras Python. A minimalist, modular, Neural Network library, Keras uses Theano or TensorFlow as a backend. ...

3⃣ Apache MXNet. ...

4⃣ Caffe. ...

5⃣ Theano Python. ...

6⃣ Microsoft Cognitive Toolkit. ...

7⃣ PyTorch. ...

8⃣ Eclipse DeepLearning4J

Link: https://dzone.com/articles/11-deep-learning-with-python-libraries-and-framewo

You can put your question via @pythonethbot
Forwarded from Future Data Science(FDS) (Asibeh Tenager)
I want to know your interest: I plan to prepare a short video that teaches Python coding from the basics to an advanced level.
Anonymous Poll
79%
Yes, I want
13%
No, but I want in text form
8%
No, I don't want at all
Machine Learning Engineer


Now, after that last point above, is where a machine learning engineer comes in. Their main function is to put that model into production. A data science model can be quite static, and an engineer can help to automatically train and evaluate that same model. They would then insert the predictions back into the data warehouse/SQL tables for your company. After that, a software engineer and a UI/UX designer will display the predictions in a user interface, if necessary. As you can see, the whole process from business problem to solution in a visible, easy-to-use format is not just the responsibility of a data scientist (although, yes, some data scientists can cover several of these roles).

The role of a machine learning engineer is also called MLOps (machine learning operations). A summary of their workflow would be something like this:

A. pkl_file of data science model

B. storage bucket (GCP — Google Cloud Composer)

C. DAG (for scheduling the trainer and evaluator of the model)

D. Airflow (visualizes the process — ML pipeline)

E. Docker (containers and virtualization)

At first, perhaps data science and machine learning could be seen as interchangeable titles and fields; however, with a closer look, we realize machine learning is more a combination of software engineering and data engineering than data science.

In the next post, I will outline where the fields do and do not cross over.
#Python_list_challenge1
Write a function named append_sum that has one parameter, a list named lst.

The function should add the last two elements of lst together and append the result to lst. It should do this process three times and then return lst.

For example, if
lst started as [1, 1, 2], the final result should be [1, 1, 2, 3, 5, 8].
send your solution @pythonethbot
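
One possible solution to the challenge above, as a minimal sketch (it assumes lst always has at least two elements):

```python
def append_sum(lst):
    """Append the sum of the last two elements to lst, three times, and return lst."""
    for _ in range(3):
        # Negative indices count from the end: lst[-1] is the last element.
        lst.append(lst[-1] + lst[-2])
    return lst

result = append_sum([1, 1, 2])
print(result)  # [1, 1, 2, 3, 5, 8]
```

Note that the function modifies the list passed in, which is what the challenge asks for.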
Similarities

Perhaps the strongest similarity between data science and machine learning is that they both touch the model. The main skills that both fields share are:

SQL
Python
GitHub
Concept of training and evaluating data

The comparisons are primarily in programming; the languages each person uses to perform their respective roles. Both positions perform some form of engineering, whether that be a data scientist querying a database using SQL or the machine learning engineer using SQL to insert the suggestions or predictions from the model back into a newly labeled column/field.
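
The second half of that paragraph, writing predictions back into a newly labeled column, can be sketched with Python's built-in sqlite3 module. The customers table, the predicted_segment column, and the predict_segment rule are all hypothetical stand-ins for a real table and a real model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, spend REAL)")
conn.executemany(
    "INSERT INTO customers (id, spend) VALUES (?, ?)",
    [(1, 20.0), (2, 250.0)],
)

def predict_segment(spend):
    """Hypothetical rule standing in for a trained model's prediction."""
    return "high" if spend >= 100 else "low"

# Add the new column, then fill it with one prediction per row.
conn.execute("ALTER TABLE customers ADD COLUMN predicted_segment TEXT")
rows = conn.execute("SELECT id, spend FROM customers").fetchall()
conn.executemany(
    "UPDATE customers SET predicted_segment = ? WHERE id = ?",
    [(predict_segment(spend), row_id) for row_id, spend in rows],
)

result = conn.execute(
    "SELECT id, predicted_segment FROM customers ORDER BY id"
).fetchall()
conn.close()
```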

Both fields require knowledge of Python (or R) and usually version control, code sharing, and pull requests through GitHub.

A machine learning engineer will sometimes want to learn how algorithms such as XGBoost or Random Forest work, and will need to look at the model’s hyperparameters for tuning in order to research memory and size constraints. While data scientists can build highly accurate models in academia or on the job, there can be more restrictions in the workplace due to time, money, and memory constraints.

Differences

Some of the differences are already outlined in the above sections of data science and machine learning, but there are some key features of both careers and academic research that are important to point out:

Data Science - focuses on statistics and algorithms

- unsupervised and supervised algorithms
- regression and classification
- interprets results
- presents and communicates results


Machine Learning - focuses on software engineering and programming
- automation
- scaling
- scheduling
- incorporating model results into a table/warehouse/UI


Education

Not only can the two roles differ in the workplace, but in academia/education as well.

There are different routes to becoming a data scientist or a machine learning engineer. A data scientist might pursue a degree in data science itself, statistics, mathematics, or actuarial science, whereas a machine learning engineer will mainly focus on software engineering (and some institutions do offer machine learning specifically as a certificate or degree).
I think you have gained some knowledge about Data Science and Machine Learning from the key notes I have posted so far.
public poll

Yes, I have got the difference and similarities of both. – 13
👍👍👍👍👍👍👍 81%
Meti, / /\, @Annanjr, @DerejeK, Lenjiso, Shubham, Abhinav, @Until_9, @L3bn4, anonymous, @StNati, @Jollya_Iru, Omnia

Yes, but I am confused – 3
👍👍 19%
Lil, @Codgunner, @Programmercplusplus

👥 16 people voted so far.
And the Best Programming Language for Data Science goes to…


The reason for using an ellipsis in the title is that we have always looked at the wrong reasons for choosing a language. A number of factors lead to the choice of a certain language. And with Data Science projects flooding the market, the question is NOT “which is the best language” but which one suits your project requirements and environment (work setting).

Most commonly used programming languages for Data Science

Python and R are the most widely used languages, ahead of others (for example, Java, Scala, and Matlab), for statistical analysis or machine learning-centric projects.

Both of these are state-of-the-art open-source programming languages with great community support. You keep learning about new libraries and tools achieving newer levels of performance and complexity.

Python is well known for its easy-to-learn and readable syntax. With a general-purpose (jack-of-all-trades) language like Python, you can build complete scientific ecosystems without worrying much about compatibility or interfacing issues.

Python code has a low maintenance cost and is arguably more robust. From data wrangling to feature selection, web scraping, and deployment of machine learning models, Python can get almost everything done, with integration support from all the major ML and deep learning APIs like Theano, TensorFlow, and PyTorch.


R was developed by academics and statisticians over two decades ago. R today enables many statisticians, analysts, and developers to carry out their analysis. There are over 12,000 packages available on CRAN (an open-source repository).

Since it was developed with statisticians in mind, R becomes the first choice for core scientific and statistical analysis. There is a package in R for almost every kind of analysis. Data analysis has been made very easy with tools like RStudio, which allows you to communicate your results with concise and elegant reports.

4 Questions to learn about the BEST suited language for your project!

So, how does one make the right choice for their work at hand?

Try answering these 4 questions:
1. Which language/framework is preferred in your organisation/industry?

Depending on the industry you are working in and the language most commonly used by your peers and competitors, you might want to speak the same language. An analysis carried out by David Robinson (Data Scientist) reflects the popularity of R by industry, and it shows that R is used especially heavily in academia and healthcare.

So, if you’re someone who wants to go into research, academia or bioinformatics, you might consider R over Python.

The other side of this coin is software industries, application-driven organizations, and product-based companies. You might have to go hand-in-hand with the tech stack of your organization’s infrastructure or the language that your colleagues/teams are using.

And most organizations/industries, including academia, have their infrastructure based on Python.

For an aspiring data scientist, it is a clear choice to learn something which has manifold applications and which could increase their chances of getting a job.
2. What is the scope of your project?

This is an important question because before you pick a language, you must have an agenda for your project and know the extent to which you want to work on it.

R: For example, if you simply want to solve a statistical problem with a dataset, perform some multivariate analyses, and prepare a report or a dashboard explaining the insights, R might turn out to be a better choice because of its powerful visualization and communication libraries.

Python: On the other hand, if the aim is to first carry out exploratory analysis, develop a deep learning model, and then deploy the model within a web application, Python’s web frameworks and support from all the major cloud providers make it a clear winner.
3. How experienced are you in the field of data science?

For a beginner in data science who has limited familiarity with statistics and mathematical concepts, Python might turn out to be a better choice because it lets you code the fragments of an algorithm with ease.

With libraries like NumPy, you can manipulate matrices and code algorithms yourself. As a novice, it is always better to learn to build things from scratch rather than hopping onto using machine learning libraries.

Whereas if you already know the fundamentals of machine learning algorithms, you can pick up either of the languages to get started with.
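
As an illustration of coding an algorithm yourself with NumPy, here is a minimal sketch that fits a straight line by gradient descent on the mean squared error. The function name, the toy data, the learning rate, and the number of steps are all assumptions chosen for this example, not values from the article.

```python
import numpy as np

def fit_line_gradient_descent(x, y, learning_rate=0.05, n_steps=5000):
    """Fit y ~ intercept + slope * x by gradient descent on mean squared error."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix: a column of ones (for the intercept) next to x.
    X = np.column_stack([np.ones_like(x), x])
    weights = np.zeros(2)
    for _ in range(n_steps):
        residuals = X @ weights - y
        # Gradient of mean((X @ w - y) ** 2) with respect to w.
        gradient = 2.0 / len(y) * (X.T @ residuals)
        weights -= learning_rate * gradient
    return weights

# Toy data generated from y = 2x + 1.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
intercept, slope = fit_line_gradient_descent(x, y)
print(round(intercept, 3), round(slope, 3))
```

Writing the update rule by hand like this shows what a library's fit method is doing for you behind the scenes.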
4. How much time do you have at hand/cost of learning?

The amount of time you can invest makes another case for your choice. Depending on your experience with programming and the delivery time of your project, you might choose one language over another to get started in the field.
If there is a high-priority project and you don’t know either of the languages, R might be an easier option to get started with, as you need limited or no programming experience. You can write statistical models with a few lines of code using existing libraries.

Python (a programmer’s choice) is a great option to start with if you have some bandwidth to explore the libraries and learn methods of exploring datasets, which in R can be done quickly within RStudio.