Fetch Data from PostgreSQL Databases in Python
We use pandas and psycopg2 together to connect to PostgreSQL. psycopg2 is a package that allows us to create a connection to PostgreSQL databases in Python, and we will use sqlio (an alias for pandas.io.sql) within pandas to interact with the database.
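As a minimal sketch of this workflow: the helper names `fetch_dataframe` and `fetch_from_postgres` are hypothetical, and the connection parameters in the usage note are placeholders, not values from the original post.

```python
import pandas.io.sql as sqlio


def fetch_dataframe(conn, query, params=None):
    """Run a SQL query on an open DB-API connection and return a DataFrame."""
    return sqlio.read_sql_query(query, conn, params=params)


def fetch_from_postgres(query, **conn_kwargs):
    """Open a psycopg2 connection, run the query, and always close the connection."""
    import psycopg2  # imported here so fetch_dataframe can be used without it

    conn = psycopg2.connect(**conn_kwargs)
    try:
        return fetch_dataframe(conn, query)
    finally:
        conn.close()
```

A call might look like `fetch_from_postgres("SELECT * FROM my_table LIMIT 10", host="localhost", dbname="mydb", user="me", password="secret")`, where every argument is an assumed placeholder. Recent pandas versions may warn that only SQLAlchemy connections are officially tested; the query should still run on a psycopg2 connection.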
Introduction to Anomaly Detection
Anomaly detection (a.k.a. outlier detection) is the process of detecting unexpected observations in a given dataset.
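One simple way to flag unexpected observations is the interquartile-range (IQR) rule. The sketch below is an illustration only, not necessarily the method the post uses; the function name `iqr_outliers`, the multiplier `k=1.5`, and the sample values are assumptions.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


readings = [10, 12, 11, 13, 12, 11, 95]  # made-up sample data
print(iqr_outliers(readings))  # prints [95]
```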
Censored Data and Survival Analysis
Censoring is a condition in which the value of a measurement or observation is only partially observed. Censored data is one kind of missing data, but it differs from the usual meaning of a missing value in machine learning. We usually encounter censored data in time-based datasets, where the event of interest has been cut off beyond a certain time boundary. We can apply survival analysis to handle the censoring in the data.
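To make the idea concrete, here is a minimal sketch of the Kaplan-Meier estimator, one standard survival-analysis technique that uses censored observations instead of discarding them. The function name `kaplan_meier` and the toy data are assumptions for illustration.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: time until the event or until censoring, per subject
    observed:  True if the event was seen, False if the subject was censored
    """
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        events = sum(1 for d, e in zip(durations, observed) if d == t and e)
        leaving = sum(1 for d in durations if d == t)
        if events:
            # Only observed events lower the survival estimate.
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        # Censored subjects leave the risk set without counting as events.
        at_risk -= leaving
    return curve


# Toy data: the subjects with durations 3 (second one) and 8 are censored.
curve = kaplan_meier([2, 3, 3, 5, 8], [True, True, False, True, False])
```

Censored subjects still count toward the risk set up to their censoring time, which is how the estimator uses partial information.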
Treatments for Imbalanced Dataset
Imbalanced datasets are a common problem in classification tasks in machine learning. Take credit card fraud prediction as a simple example: the target values are either fraud (1) or not fraud (0), but fraud cases (1) may make up less than one percent of the whole dataset.
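One common treatment is to weight each class inversely to its frequency so that the rare class is not ignored during training. The sketch below uses the heuristic n_samples / (n_classes * class_count), the same formula scikit-learn applies for `class_weight="balanced"`. The function name `balanced_class_weights` and the 99-to-1 label split are assumptions for illustration.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


labels = [0] * 99 + [1]  # 1% fraud, as in the example above
weights = balanced_class_weights(labels)
# The fraud class (1) gets a weight of 50.0; the majority class gets about 0.505.
```

These weights can then be passed to a model's loss function or sample-weight argument. Resampling the data is another option not shown here.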
Export stargazer table in .tex & .html
For statistical analysis, we usually use the stargazer package in R to generate tables of regression results.