E-Center's new course is aimed at people from non-technical backgrounds: social science students and researchers, journalists, and public policy experts who have not used programming or statistics in their work before but would like to acquire hard skills in data collection and data analysis.
The course consists of two modules that cover all the basic operations a researcher usually carries out: data collection, data cleaning and transformation, data visualization, and data analysis.
The first module focuses on strategies for collecting data on the Internet by parsing websites. The online revolution in information exchange, the rise of social networks, and the digitization of many spheres of people's lives have created almost unlimited pools of information and data available for collection and analysis. Data analysts, market research specialists, academic researchers and students, journalists, and investigators have gained a wide variety of opportunities to answer their questions by analyzing publicly available data. The only challenge is to collect the relevant information in the most efficient way, turn it into data, and prepare it for analysis.
The second module is designed to equip participants with basic theoretical knowledge of statistics and practical skills in data cleaning, transformation, visualization, and analysis in the R statistical programming environment. Combining theoretical knowledge of statistics with programming skills opens new horizons and creates a powerful capability for data analysis tasks, which usually include cleaning the data, visualization, descriptive statistics, and hypothesis testing. An understanding of statistical concepts and fundamentals, coupled with the ability to prepare data and analyze it properly, gives researchers the opportunity to draw well-grounded causal conclusions.
MODULE "INTRO TO PARSING AND AUTOMATION OF DATA COLLECTION ON THE INTERNET"
Topics (theory and practice)
1. Information, data, and automation: what is information, what is data, and how to automate the transformation of the former into the latter. The most typical forms of information/data available on the web (raw data: text and numbers in text, tables, JSON format), types of websites
2. Parsing of a static website: HTML tags and the structure of the site, HTTP requests, lists and the transformation of data into a dataframe
3. Parsing of a table: tables as the most common format of information/data on the web, how to approach the parsing of tables
4. APIs and parsing of JSON output/file: API, API requests, and parsing of JSON files
5. Parsing of a dynamic website: the Selenium library and accessing the content of dynamic websites
Lecturer: Iurii Agafonov
Duration: 4 weeks, 5 sessions, 2 hours per session
Starting date: February 27, 2023
Price: 35 000 dram
Language: English
Format: offline
What you get as a result of the course:
- Basic knowledge of Python (types of objects, functions, HTTP requests, API requests, data transformation with pandas)
- Basic parsing skills as a foundation for deepening your knowledge further
- Examples of code to apply in your own data collection projects
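To give a flavour of topic 3 (parsing of a table), here is a minimal sketch, not taken from the course materials. It uses only Python's standard library rather than the pandas and requests tools named above, so that it runs without installation. The names TableParser, table_to_records, and SAMPLE_HTML are hypothetical, and the sample table is made-up illustration data.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # row currently being read, or None
        self._cell = None  # text pieces of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Keep text only while we are inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_records(html):
    """Turn an HTML table into a list of dicts keyed by the header row."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Made-up sample page fragment (an assumption for illustration only).
SAMPLE_HTML = """
<table>
  <tr><th>name</th><th>value</th></tr>
  <tr><td>A</td><td>1</td></tr>
  <tr><td>B</td><td>2</td></tr>
</table>
"""

records = table_to_records(SAMPLE_HTML)
print(records)
```

Each record is a plain dictionary, which is the kind of structure that can then be handed to a dataframe library for cleaning and analysis. Cell values stay as strings here; converting them to numbers is a separate cleaning step.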
MODULE "INTRO TO STATISTICS, DATA TRANSFORMATION, VISUALIZATION, AND ANALYSIS"
Topics
Theoretical part
1. Random variable, cumulative distribution function, probability density function
2. Moments of a random variable (expected value, variance). Correlation and covariance. Law of large numbers and central limit theorem
3. Statistical hypothesis testing. Student's t-test, Chi-squared test.
4. Empirical distribution function, sample statistics
5. Linear regression
Practical part
1. Quantitative methods: history, tools, main concepts. Introduction to R and RStudio
2. Data transformation and visualization: main rules and tools
3. Statistical hypotheses and errors. Chi-square statistic (χ²). Statistical tests for comparing means
4. Correlation and covariance. Calculation of correlation coefficients. Construction of correlation matrices
5. Simple and multiple OLS regression. Diagnostics of regression models
What you get as a result of the course:
- Basic knowledge of statistics
- Basic knowledge of R and RStudio (tidyverse library, basic R functions)
- Basic skills in data transformation, visualization, and analysis in R
- Examples of code to apply in your own data analysis projects
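As a rough illustration of practical topics 4 and 5 (correlation coefficients and simple OLS regression), here is a minimal sketch that is not taken from the course materials. The module itself works in R; Python is used here only to keep the examples on this page in one language. The function names pearson_r and ols_fit and the sample numbers are hypothetical.

```python
from statistics import mean


def _centered_sums(x, y):
    """Return (Sxy, Sxx, Syy): sums of products of deviations from the means."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy, sxx, syy


def pearson_r(x, y):
    """Pearson correlation coefficient: Sxy / sqrt(Sxx * Syy)."""
    sxy, sxx, syy = _centered_sums(x, y)
    return sxy / (sxx * syy) ** 0.5


def ols_fit(x, y):
    """Simple OLS regression y = intercept + slope * x; returns (intercept, slope)."""
    sxy, sxx, _ = _centered_sums(x, y)
    slope = sxy / sxx
    intercept = mean(y) - slope * mean(x)
    return intercept, slope


# Made-up data lying exactly on the line y = 1 + 2x (illustration only).
x = [1, 2, 3, 4]
y = [3, 5, 7, 9]

intercept, slope = ols_fit(x, y)
print(pearson_r(x, y), intercept, slope)
```

Because the sample points lie exactly on a straight line, the correlation comes out as 1 and the fit recovers the line itself. Real data would also need the regression diagnostics listed in topic 5.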
Lecturers: Iurii Agafonov, Mikhail Sokolov
Duration: 5 weeks, 10 sessions, 2 hours per session
Starting date: March 27, 2023
Price: 48 000 dram
Language: Russian, English
Format: online, offline
Discount
You can take just one of the two modules. If you decide to take both, there is a discount of about 16%: instead of 83 000 dram, the price will be 70 000 dram.
Special conditions: If you have any special requests, requirements, or questions (e.g. preferred language or session format), please do not hesitate to write to us. We will see what we can do about it.