Machine Learning Reading Diary

Bedouin
Apr 30, 2020

Interesting articles and links on machine learning and data science, newest first. This may be helpful to aspiring programmers and analysts.

May 13, 2020

A Simple Explanation of Information Gain and Entropy
https://victorzhou.com/blog/information-gain/

Fastest way to compute entropy in Python
https://stackoverflow.com/questions/15450192/fastest-way-to-compute-entropy-in-python
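
A minimal sketch of the entropy computation, assuming NumPy and a 1-D array of class labels. The function name `entropy` and the sample labels are my own and not taken from the linked thread.

```python
import numpy as np

def entropy(labels, base=2):
    """Shannon entropy of a 1-D array of class labels (illustrative helper)."""
    # Count how often each distinct label occurs.
    _, counts = np.unique(labels, return_counts=True)
    # Convert counts to probabilities; every count is >= 1, so the log is defined.
    probs = counts / counts.sum()
    return float(-(probs * np.log(probs)).sum() / np.log(base))

balanced = entropy([0, 0, 1, 1])   # two equally likely classes
pure = entropy([1, 1, 1, 1])       # a single class
print(balanced, pure)
```

Entropy is highest when the classes are evenly mixed and zero when only one class is present.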

Decision Tree Classification in Python
https://www.datacamp.com/community/tutorials/decision-tree-classification-python

Simple Line Plots
https://jakevdp.github.io/PythonDataScienceHandbook/04.01-simple-line-plots.html

Annotate matplotlib chart
https://python-graph-gallery.com/193-annotate-matplotlib-chart/

DecisionTreeClassifier
https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html#sklearn.tree.DecisionTreeClassifier

Information Gain in Decision Tree Split
https://statinfer.com/204-3-5-information-gain-in-decision-tree-split/
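
A rough sketch of information gain for a single binary split, assuming NumPy. The helper names `entropy` and `information_gain` and the toy data are hypothetical, not from the linked article.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy in bits of a 1-D label array."""
    _, counts = np.unique(labels, return_counts=True)
    probs = counts / counts.sum()
    return float(-(probs * np.log2(probs)).sum())

def information_gain(labels, mask):
    """Entropy reduction from splitting `labels` by the boolean `mask`."""
    labels = np.asarray(labels)
    mask = np.asarray(mask, dtype=bool)
    left, right = labels[mask], labels[~mask]
    if len(left) == 0 or len(right) == 0:
        return 0.0  # a split that sends everything one way gains nothing
    # Weight each child's entropy by its share of the samples.
    weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
    return entropy(labels) - weighted

y = np.array([0, 0, 0, 1, 1, 1])
x = np.array([1.0, 2.0, 3.0, 7.0, 8.0, 9.0])
gain_clean = information_gain(y, x < 5)    # separates the classes completely
gain_mixed = information_gain(y, x < 2.5)  # leaves one child mixed
print(gain_clean, gain_mixed)
```

A decision tree using the entropy criterion would prefer the threshold with the larger gain.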

What is “entropy and information gain”
https://stackoverflow.com/questions/1859554/what-is-entropy-and-information-gain

Controlling figure aesthetics
https://seaborn.pydata.org/tutorial/aesthetics.html

Adding a legend to PyPlot in Matplotlib in the simplest manner possible
https://stackoverflow.com/questions/19125722/adding-a-legend-to-pyplot-in-matplotlib-in-the-simplest-manner-possible

Matplotlib scatter plot with legend
https://stackoverflow.com/questions/26558816/matplotlib-scatter-plot-with-legend

Plotting decision boundary of logistic regression
https://stackoverflow.com/questions/28256058/plotting-decision-boundary-of-logistic-regression

NumPy: Count the frequency of unique values in numpy array
https://www.w3resource.com/python-exercises/numpy/python-numpy-exercise-94.php
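
One way to get the frequency counts, assuming NumPy's `np.unique` with `return_counts=True`. The sample array is made up for illustration.

```python
import numpy as np

a = np.array([10, 10, 20, 10, 20, 20, 20, 30, 30, 50, 40, 40])
# np.unique returns the sorted distinct values and, optionally, their counts.
values, counts = np.unique(a, return_counts=True)
freq = dict(zip(values.tolist(), counts.tolist()))
print(freq)
```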

Make blobs
https://scikit-learn.org/stable/modules/generated/sklearn.datasets.make_blobs.html

Scatter plot of 1-D bimodal data from sklearn make_blobs()
https://stackoverflow.com/questions/55321496/scatter-plot-of-1-d-bimodal-data-from-sklearn-make-blobs

Python sklearn.datasets.make_blobs() Examples
https://www.programcreek.com/python/example/82898/sklearn.datasets.make_blobs

How to set the range of y-axis for a seaborn boxplot
https://stackoverflow.com/questions/33227473/how-to-set-the-range-of-y-axis-for-a-seaborn-boxplot/33227833

How to insert an inline image in Google Colaboratory from Google Drive
https://stackoverflow.com/questions/50670920/how-to-insert-an-inline-image-in-google-colaboratory-from-google-drive

May 10, 2020

PyTorch Discussion Tips and Tricks
https://anmoljoshi.com/Pytorch-Dicussions/

May 06, 2020

Pandas: filling missing values by mean in each group
https://stackoverflow.com/questions/19966018/pandas-filling-missing-values-by-mean-in-each-group/45373095

df['value'] = df['value'].fillna(df.groupby('name')['value'].transform('mean'))
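
A small runnable sketch of the group-mean fill on a made-up frame; the column names follow the one-liner, the data is assumed.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": ["A", "A", "B", "B", "B", "C"],
    "value": [1.0, np.nan, np.nan, 2.0, 3.0, np.nan],
})
# transform('mean') returns a Series aligned with df that holds each row's group mean.
group_mean = df.groupby("name")["value"].transform("mean")
df["value"] = df["value"].fillna(group_mean)
print(df)
```

A group with no observed values (here `C`) has no mean to fill with, so its rows stay NaN.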

Counting non zero values in each column of a dataframe in python
https://stackoverflow.com/questions/26053849/counting-non-zero-values-in-each-column-of-a-dataframe-in-python

df.astype(bool).sum(axis=0)
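
A short sketch on assumed data. Note that NaN casts to True, so missing values count as non-zero unless they are filled first.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [0, 1, 2], "b": [0.0, 0.0, np.nan]})

# NaN is truthy when cast to bool, so it is counted as non-zero here.
naive = df.astype(bool).sum(axis=0)
# Filling NaN with 0 first keeps missing values out of the count.
strict = df.fillna(0).astype(bool).sum(axis=0)
print(naive.to_dict(), strict.to_dict())
```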

Handling division by zero in Pandas calculations
https://stackoverflow.com/questions/45540015/handling-division-by-zero-in-pandas-calculations

(a / b).replace([np.inf, -np.inf], 0)
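
Dividing by zero in pandas can yield inf, -inf, or NaN (for 0/0). A minimal sketch that maps all three to 0, using made-up Series; whether 0 is the right fill value depends on the analysis.

```python
import numpy as np
import pandas as pd

a = pd.Series([1.0, -1.0, 0.0, 4.0])
b = pd.Series([0.0, 0.0, 0.0, 2.0])

# a / b gives inf, -inf, NaN, 2.0; turn both infinities into NaN, then fill.
ratio = (a / b).replace([np.inf, -np.inf], np.nan).fillna(0)
print(ratio.tolist())
```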

Pandas error when using if-else to create new column: The truth value of a Series is ambiguous
https://stackoverflow.com/questions/48123368/pandas-error-when-using-if-else-to-create-new-column-the-truth-value-of-a-serie/48123413
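
A sketch of one common way around this error, assuming NumPy's `np.where` for the element-wise choice. The `score` column and the pass mark are invented for the example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"score": [35, 50, 80]})

# `if df["score"] >= 50:` raises ValueError because the comparison yields
# a Series of booleans, not a single True/False.
# np.where picks a value per row instead.
df["result"] = np.where(df["score"] >= 50, "pass", "fail")
print(df)
```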

Python : How to use if, else & elif in Lambda Functions
https://thispointer.com/python-how-to-use-if-else-elif-in-lambda-functions/

lambda <arguments> : <Return Value if condition is True> if <condition> else <Return Value if condition is False>
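
A lambda has no `elif`, but conditional expressions can be nested to the same effect. A small sketch with a hypothetical `grade` function:

```python
# Nested conditional expressions stand in for if / elif / else.
grade = lambda s: "A" if s >= 90 else ("B" if s >= 75 else "C")

grades = [grade(s) for s in (95, 80, 60)]
print(grades)
```

For anything longer than two branches a named `def` is usually easier to read.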

Pandas DataFrame apply
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.apply.html

df.apply(lambda x: [1, 2], axis=1)

Pandas DataFrame iteritems (deprecated in pandas 1.5 and removed in 2.0; use items instead)
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.iteritems.html

for label, content in df.items():
    print(label, content, sep='\n')

Pandas DataFrame fillna
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.fillna.html

df.fillna(value=values)
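
Here `values` can be a dict mapping column names to fill values. A sketch on an assumed frame:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [np.nan, 3.0],
    "B": [2.0, np.nan],
    "C": [np.nan, np.nan],
})
values = {"A": 0, "B": 1}  # per-column fill values; column C is not listed
filled = df.fillna(value=values)
print(filled)
```

Columns missing from the dict are left untouched, so `C` keeps its NaN entries.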

SimpleImputer
https://scikit-learn.org/stable/modules/generated/sklearn.impute.SimpleImputer.html

import numpy as np
from sklearn.impute import SimpleImputer
imp_mean = SimpleImputer(missing_values=np.nan, strategy='mean')

May 02, 2020

Working with Missing Data in Pandas
https://www.geeksforgeeks.org/working-with-missing-data-in-pandas/

Plot multiple columns of pandas data frame on the bar chart
df.plot(x="X", y=["A", "B", "C"], kind="bar")
https://stackoverflow.com/questions/42128467/matplotlib-plot-multiple-columns-of-pandas-data-frame-on-the-bar-chart

Selecting multiple columns in a pandas dataframe
df1 = df[['a','b']]
https://stackoverflow.com/questions/11285613/selecting-multiple-columns-in-a-pandas-dataframe

Select rows containing certain values from Pandas DataFrame
df[df.values == 'banana']
df[df.isin(values).any(axis=1)]
https://stackoverflow.com/questions/38185688/select-rows-containing-certain-values-from-pandas-dataframe
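
A runnable sketch of the `isin` approach; the frame and the `values` list are made up.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["apple", "kiwi", "pear"],
    "B": ["banana", "plum", "apple"],
})
values = ["banana", "pear"]

# isin marks matching cells; any(axis=1) keeps rows with at least one match.
mask = df.isin(values).any(axis=1)
selected = df[mask]
print(selected)
```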

Python Pandas : How to Drop rows in DataFrame by conditions on column values
df.drop(labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')
https://thispointer.com/python-pandas-how-to-drop-rows-in-dataframe-by-conditions-on-column-values/

May 01, 2020

Delete rows from a pandas DataFrame based on a conditional expression
df.drop(df[df.score < 50].index, inplace=True)
https://stackoverflow.com/questions/13851535/delete-rows-from-a-pandas-dataframe-based-on-a-conditional-expression-involving
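
Keeping the rows you want with a boolean mask is an alternative to dropping by index. A sketch on assumed data showing that both give the same result here:

```python
import pandas as pd

df = pd.DataFrame({"name": ["a", "b", "c"], "score": [40, 50, 90]})

# Option 1: drop the index labels of the rows that fail the condition.
dropped = df.drop(df[df["score"] < 50].index)
# Option 2: keep the rows that pass the opposite condition.
kept = df[df["score"] >= 50]
print(kept.equals(dropped))
```

The two differ when the index has duplicate labels, because `drop` removes every row that shares a dropped label.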

How to Get Frequency Counts of a Column in Pandas Dataframe: Pandas Tutorial
df['continent'].value_counts()
https://cmdlinetips.com/2018/02/how-to-get-frequency-counts-of-a-column-in-pandas-dataframe/
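
A quick sketch with invented data, also showing `normalize=True` for proportions instead of raw counts.

```python
import pandas as pd

df = pd.DataFrame({
    "continent": ["Asia", "Africa", "Asia", "Europe", "Asia", "Africa"],
})
counts = df["continent"].value_counts()                # sorted, most frequent first
shares = df["continent"].value_counts(normalize=True)  # fractions that sum to 1
print(counts)
print(shares)
```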

Subset of the DataFrame’s columns based on the column dtypes
df.select_dtypes(include=['category'])  # Pandas categorical dtypes
df.select_dtypes(exclude=['int'])
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.select_dtypes.html
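
A small sketch on a made-up frame. It uses the generic `'number'` selector, which covers all integer and float columns regardless of platform width.

```python
import pandas as pd

df = pd.DataFrame({
    "n": [1, 2],
    "x": [0.5, 1.5],
    "c": pd.Categorical(["a", "b"]),
})
categorical = df.select_dtypes(include=["category"]).columns.tolist()
numeric = df.select_dtypes(include=["number"]).columns.tolist()
non_numeric = df.select_dtypes(exclude=["number"]).columns.tolist()
print(categorical, numeric, non_numeric)
```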

Handle missing values in pandas
https://www.youtube.com/watch?v=fCMrO_VzeL8

Python pandas Q&A video series
https://github.com/justmarkham/pandas-videos

April 24, 2020

Deep Learning with PyTorch: A 60 Minute Blitz
https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html
