Courses
Required Courses
DATA 0200 Data Analytics and Machine Learning with Python. Data analytics and machine learning using Python. Fundamentals of programming in Python and practical experience working with data, as well as the traditional machine learning approach (train-test-validate) applied to supervised and unsupervised models. Analysis of the appropriate use and potential misuse of data; critical thinking about statistical evidence and analytical conclusions. (Fall)
DATA 0201 Database Design and SQL. Fundamentals of Structured Query Language (SQL). Principles of relational database design and use. Queries for managing databases and manipulating data. Upon completion, students can query relational databases from any vendor and design databases for any application. (Fall)
DATA 0202 Data Analytics with R. Statistical analysis and visualization with R. Fundamentals of R; examination of how data are collected, modeled, and interpreted across disciplines in the arts, humanities, and sciences. Ethical and methodological considerations in data analysis using R; development of skills to identify misuse of data and critically evaluate statistical evidence and conclusions. (Spring)
DATA 0220 Communicating with Data. Multiple approaches to communicating data, including visual, oral, and written methods. Emphasis on analytical thinking, best practices in data visualization and storytelling, and ethical considerations in the presentation and interpretation of data. Design and effective communication of insights for diverse audiences using data visualization tools such as Tableau, Power BI, and Plotly. Course activities include analyzing case studies and giving presentations based on real-world data. (Spring)
DATA 0296 Capstone Internship (Full-Time). Supervised, full-time professional internship in data analytics applying program knowledge to real-world projects or data sets in a professional setting. Minimum of 350 hours completed during the summer or academic year; enrollment confers full-time status. Emphasis on development of applied problem-solving and workplace skills valued by employers.
DATA 0299 Capstone Internship. This course provides students with the opportunity to apply the knowledge and problem-solving skills they have learned over the course of the program to real-world projects or data sets in a professional context. Students must complete a minimum of 100 hours of a professional internship involving data analytics. The internship is completed during either the summer or the academic year and is critical to building the problem-solving and soft skills required by employers.
Note: Data Analytics students may enroll in either DATA 0296 or DATA 0299 to complete the Capstone Internship requirement.
Elective Courses
DATA 0221 Introduction to Natural Language Processing. Fundamentals of Natural Language Processing (NLP) and Large Language Models (LLMs). Machine Learning (ML) based NLP techniques, including both statistical ML approaches and Deep Neural Networks (DNNs). Theoretical foundations of deep learning architectures including MLP, RNN, LSTM, and the Transformer architecture that powers modern NLP and LLMs. Hands-on implementations of ML algorithms and DNNs for NLP using the TensorFlow Keras framework. Overview of state-of-the-art AI agents. Opportunity to carry out a mini NLP research project following industry-standard practices.
Prerequisites: proficiency in Python; familiarity with probability, linear algebra, and calculus. (Fall)
DATA 0222 Deep Learning for Multimodal AI. Modern Artificial Intelligence (AI) employs a variety of modalities, including image, audio, and text. Introduction to deep learning and its domain-specific applications, from computer vision and Large Language Models (LLMs) to audio and speech recognition. Hands-on experience building deep learning models (CNN, RNN, Transformers, etc.) and AI systems using frameworks such as PyTorch and TensorFlow Keras. Survey of state-of-the-art literature on multimodal AI such as CLIP (Language-Image) and CLAP (Language-Audio). Opportunity to carry out a mini AI research project following industry-standard practices.
Prerequisites: proficiency in Python; familiarity with probability, linear algebra, and calculus. (Spring)
DATA 0297 Special Topics: Introduction to Data Engineering. This course covers fundamental data engineering concepts and techniques for building reliable data pipelines. Students will learn to design, implement, and maintain systems that collect, transform, and deliver data. Topics include ETL processes, data modeling, distributed processing frameworks, pipeline orchestration, and incorporating generative AI and other technologies into intelligent data pipelines. We will examine tooling ranging from bash scripts to DuckDB to managed Spark clusters to transform small and large datasets and orchestrate intricate data flows, while communicating the usefulness and limitations of data products to stakeholders. Students will acquire practical skills applicable across a variety of industries and technical roles, including analytics engineer, data engineer, and data analyst. Prerequisites: proficiency in Python; familiarity with SQL and basic cloud concepts. (Fall)
DATA 0297 Special Topics: Healthcare Analytics. In this course, we will examine how data and information are used and analyzed to improve clinical outcomes and efficiency in the US healthcare system. After reviewing the basics of how the US healthcare system works, viewed through the lens of data, topics will include the use of data derived from electronic health record (EHR) systems and lab information systems, and the basics of several interoperability data standards such as HL7 v2 and HL7 FHIR. We will highlight how various types of healthcare data, along with generative AI tools, can be used to drive clinical improvements for patients, increase operational efficiency, and lessen administrative burdens in healthcare. Prerequisites: familiarity with Python and SQL. (Spring)