Monthly Archives: June 2024
Demystifying BERT: An Intuitive Dive into Transformer-based Language Models
The transformer neural network architecture was initially created to solve the problem of language translation. It was very well received because previous models like LSTM networks had several issues. Issues with LSTM Networks: LSTM networks are slow to train because words are processed sequentially. It takes many time steps for the neural network to learn, and it…
Mastering Pandas: A Comprehensive Step-by-Step Guide to Efficient Data Analysis
In this blog post, I share all the Pandas materials that I have written over the past couple of months in an organized, easy-to-follow, logical, step-by-step tutorial. Whether you’re a beginner looking to get started or an experienced user aiming to deepen your understanding, this comprehensive guide will provide you with the essential functions and…
Harnessing the Power of Text Embeddings for Causal Inference
In the evolving landscape of data science, researchers and practitioners are continually seeking innovative ways to handle complex data types. One such advancement is the use of text embeddings, a powerful technique that transforms text data into meaningful numerical representations. This blog post delves into the intricate world of text embeddings and explores how they…
Unveiling Double/Debiased Machine Learning (DML): A Practical Guide
Understanding the true effect of a variable (like a new medication or policy) on an outcome (such as health improvement or economic growth) can be challenging. Confounding variables—factors that affect both the treatment and the outcome—often complicate this task. Double/Debiased Machine Learning (DML) provides a powerful method to uncover these causal relationships, even in complex,…
Introduction to Testing DAG Validity: Local Markov and Edge Dependence Tests
In the realm of data science and causal inference, Directed Acyclic Graphs (DAGs) are powerful tools for modeling the causal relationships between variables. However, creating a DAG is only the first step. To ensure the accuracy and reliability of the causal inferences drawn from these models, we need to validate that the DAG accurately represents…