ACM India - IKDD Summer School on Data Science Hosted by IIT Gandhinagar
Sponsored by ShareChat
Dates: 4 to 16 July 2022
Name of the school: ACM India – IKDD Summer School on Data Science
Host Institution’s Name and Address: IIT Gandhinagar, Gandhinagar, Gujarat
Industry sponsor: ShareChat
Dates: 4th July 2022 - 16th July 2022
Academic Coordinator
- Name: Anirban Dasgupta
- Email: [email protected]
Description of school:
This school covers the algorithmic, statistical, and engineering challenges that arise at the various stages of data analysis. Each subtopic will include both theoretical and hands-on components. We will cover how to collect and clean data, probabilistic models for data, and the algorithmic challenges that arise when scaling these models to large datasets. We will also take deep dives into data-driven modeling in three scientific domains: natural language processing, computer vision, and earth and climate sciences. Lastly, we will learn how to deploy a machine learning model in production and keep it up to date. There will be multiple lectures on each subtopic, and participants will be taken from the basics to some of the cutting-edge questions in these areas.
List of subtopics:
- introduction to the data collection pipeline, tools and techniques for data processing (e.g., normalization, outlier removal), descriptive statistics, visualization
- models for supervised learning (MLE, MAP, and fully Bayesian modeling), clustering, matrix factorization, spatio-temporal data modeling
- algorithms for data and dimension reduction
- experiment design and model evaluation, A/B testing etc.
- modern models for NLP, computer vision, data-driven modeling for earth and climate sciences
- data-science lifecycle, standard practices of MLOps
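As a flavour of the hands-on side of the first subtopic, the sketch below shows one common way to remove outliers and normalize a numeric column in Python. It is illustrative only and is not taken from the school's materials: the function names `remove_outliers_iqr` and `zscore_normalize`, the sample `readings`, and the choice of Tukey's 1.5 x IQR rule are all assumptions made for this example.

```python
import statistics


def remove_outliers_iqr(values, k=1.5):
    """Keep only points inside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def zscore_normalize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # All values identical: nothing to scale, so map everything to 0.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Hypothetical sensor readings with one obviously bad measurement.
readings = [10.0, 12.0, 11.0, 13.0, 12.0, 11.0, 95.0]

cleaned = remove_outliers_iqr(readings)   # drops the 95.0 reading
normalized = zscore_normalize(cleaned)    # mean ~0, standard deviation ~1
```

Removing outliers before normalizing matters here: a single extreme value would otherwise inflate the standard deviation and squash the remaining points toward zero.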
List of speakers (with affiliation): The current set of confirmed speakers is the following.
- Ashish Tendulkar (Google)
- Nipun Batra (IITGN)
- Mayank Singh (IITGN)
- Anirban Dasgupta (IITGN)
- Shanmuga Raman (IITGN)
- Udit Bhatia (IITGN)
- Surender Kumar (Flipkart)
- Satyanath Bhat (IIT Goa)
- Shivam Rana (Swiggy)
Background / prior courses recommended:
Participants are expected to have the following background. The curated links below point to material that can be used to revise or pick up the necessary topics.
- Programming
  - A course on Python programming, e.g. this course on NPTEL or this one that includes data structures.
- Linear algebra
  - 3Blue1Brown playlist on linear algebra
  - MIT-OCW Introduction to Linear Algebra
- Basics of data structures and algorithms
  - NPTEL course by Prof. Naveen Garg
- Introductory probability
  - MIT-OCW (first ten lectures should suffice)
  - Khan Academy series on probability
Any specific software (Matlab, Python, etc.) to be used:
- Python
Detailed schedule: https://labs.iitgn.ac.in/datascience/summer-school/