The U.S. Census Bureau is one of the most comprehensive government sources for data in the United States. It conducts surveys and collects information on a wide range of demographic and economic topics on an ongoing basis. Some key datasets available from the Census Bureau that are useful for student capstone projects include:

American Community Survey (ACS): An ongoing survey that provides vital information on a yearly basis about the U.S. population, housing, social, and economic characteristics. Data is available down to the block group level.

Population estimates: Provides annual estimates of the resident population for the nation, states, counties, cities, and towns.

Economic Census: Conducted every 5 years, it provides comprehensive, detailed, and authoritative data about the structure and functioning of the U.S. economy, including statistics on businesses, manufacturing, retail trade, wholesale trade, services, transportation, and other economic activities.

County Business Patterns: Annual series that provides subnational economic data by industry with employment levels and payroll information.
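As a rough sketch of how a student might pull ACS figures programmatically, the snippet below builds a query URL for the Census Bureau's public data API and converts its row-based JSON response into dictionaries. The year (2022), the ACS 5-year endpoint, and the variable B01003_1E (total population) are illustrative assumptions, and the helper names are hypothetical; check the API documentation for the tables a given project actually needs.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE = "https://api.census.gov/data"

def build_acs_url(year, variables, geography):
    """Build a query URL for the ACS 5-year dataset (assumed endpoint layout)."""
    query = urlencode({"get": ",".join(variables), "for": geography}, safe=",:*")
    return f"{BASE}/{year}/acs/acs5?{query}"

def rows_to_records(rows):
    """Convert the API's list-of-lists response (first row = header) to dicts."""
    header, *data = rows
    return [dict(zip(header, row)) for row in data]

def fetch_acs(year, variables, geography):
    """Download and parse one query. Requires network access."""
    with urlopen(build_acs_url(year, variables, geography)) as resp:
        return rows_to_records(json.load(resp))
```

Calling fetch_acs(2022, ["NAME", "B01003_1E"], "state:*") would request one row per state. Values come back as strings, so numeric columns need converting before analysis.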

The National Center for Education Statistics (NCES) maintains a wide range of useful datasets related to education in the United States. Examples include:

Private School Universe Survey (PSS): Provides the most comprehensive, current, and reliable data available on private schools in the U.S. Data includes enrollments, teachers, finances, and operational characteristics.

Common Core of Data (CCD): A program of the U.S. Department of Education that collects fiscal and non-fiscal data about all public schools, public school districts, and state education agencies in the U.S. Includes student enrollment, staffing, finance data and more.

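Schools and Staffing Survey (SASS): Collected data on the characteristics of teachers and principals and general conditions in America’s elementary and secondary schools. It was last fielded in 2011-12 and has since been succeeded by the National Teacher and Principal Survey (NTPS), but it remains a good source for research on education staffing issues.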

Early Childhood Longitudinal Study (ECLS): Gathers data on children’s early school experiences beginning with kindergarten and progressing through elementary school. Useful for developmental research.

Two additional federal sources with extensive publicly available data include:

The National Institutes of Health (NIH) via NIH RePORTER – Searchable database of federally funded scientific research projects conducted at universities, medical schools, and other research institutions. Useful for finding data and studies relevant to health- and medicine-focused projects.

The Department of Labor via data.gov and API access – Provides comprehensive labor force statistics including employment levels, wages, employment projections, consumer spending patterns, occupational employment statistics and more. Valuable for capstones related to labor market analysis.

Some other noteworthy data sources include:

Pew Research Center – Nonpartisan provider of polling data, demographic trends, and social issue analyses. Covers a wide range of topics including education, health, politics, internet usage and more.

Gallup Polls and surveys – Leader in daily tracking and large nationally representative surveys on all aspects of life. Good source for attitude and opinion polling data.

Federal Reserve Economic Data (FRED) – Extensive collections of time series economic data provided by the Federal Reserve Bank of St. Louis. Covers GDP, income, employment, production, inflation and many other topics.

Data.gov – Central catalog of datasets from the U.S. federal government including geospatial, weather, environment and many other categories. Useful for exploring specific agency/government program level data.

In addition to the above government and private sources, academic libraries offer access to numerous databases from private data vendors that can supplement the publicly available sources. Examples worth exploring include:

ICPSR – Inter-university Consortium for Political and Social Research. Vast archive of social science datasets with strong collections in public health, criminal justice and political science.

IBISWorld – Industry market research reports with financial ratios, revenues, industry structures and trends for over 700 industries.

ProQuest – Extensive collections spanning dissertations, newspapers, company profiles and statistical datasets. Particularly strong holdings in the social sciences.

Mintel Reports – Market research reports analyzing thousands of consumer packaged goods categories along with demographic segmentation analysis.

EBSCOhost Collections – Aggregates statistics and market research from numerous third party vendors spanning topics like business, economics, psychology and more.

In short, students have access to a wealth of high-quality data sources from governments, non-profits and academic library databases that can support strong empirical research and analysis for capstone projects across a wide range of disciplines. With diligent searching, students can often find surveys that use consistent collection methods from year to year, which makes it possible to assemble time series datasets for studying trends. The sources above should provide a solid starting point for any student looking to use real-world data in a culminating undergraduate research project.


Gathering user feedback is crucial after the initial launch of any new software, product, or service. It allows companies to understand how real people are actually using and experiencing their offering, identify issues or opportunities for improvement, and make informed decisions on what to prioritize for future development.

For our initial launch, we had a multi-pronged approach to feedback collection that involved both quantitative and qualitative methods. On the quantitative side, we implemented tracking of key metrics within the product itself such as active user counts, time spent on different features, error/crash rates, completion of onboarding flows, and conversion rates for core tasks. This data was automatically collected in our analytics platform and provided insights into what parts of the experience were working well and where users may be dropping off.

We also implemented optional in-product surveys that would pop up after significant user milestones like completing onboarding, making a purchase, or using a new feature for the first time. These surveys asked users to rate their satisfaction with various aspects of the experience on a 1-5 star scale and to leave open-ended comments. Automatic trigger-based surveys allowed us to collect statistically meaningful sample sizes of feedback on specific parts of the experience.
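To illustrate how milestone-triggered ratings like these might be summarized, here is a minimal sketch that groups 1-5 star responses by the milestone that triggered the survey. The record layout and field names are assumptions made for the example, not a description of our actual pipeline.

```python
from collections import defaultdict
from statistics import mean

def summarize_ratings(responses):
    """Group 1-5 star ratings by the milestone that triggered the survey."""
    by_milestone = defaultdict(list)
    for r in responses:
        by_milestone[r["milestone"]].append(r["stars"])
    return {
        m: {"n": len(stars), "mean": round(mean(stars), 2)}
        for m, stars in by_milestone.items()
    }

# Hypothetical survey responses
responses = [
    {"milestone": "onboarding", "stars": 4},
    {"milestone": "onboarding", "stars": 2},
    {"milestone": "purchase", "stars": 5},
]
summary = summarize_ratings(responses)
```

Reporting the response count next to each mean makes it easier to tell which averages rest on enough responses to be trusted.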

In addition to in-product feedback mechanisms, we initiated several email campaigns targeting both active users as well as people who had started but not completed the onboarding process. These emails simply asked users to fill out an online survey sharing their thoughts on the product in more depth. We saw response rates of around 15-20% for these surveys which provided a valuable source of qualitative feedback.

To gather perspectives from customers who did not complete the onboarding process or become active users, we also conducted interviews with 10 individuals who had started but not finished signing up. These interviews dug into the specific reasons for drop-off and pain points encountered during onboarding. Insights from these interviews were especially helpful for identifying major flaws to prioritize fixing in early updates.

For active customers, we hosted two virtual focus groups with 5 participants each to get an even deeper qualitative understanding of how they used different features and which aspects of the experience could be improved. Focus groups allowed participants to build off each other’s responses in a dynamic discussion format, which uncovered nuanced feedback.

In addition to directly surveying and interviewing users ourselves, we closely monitored forums on our own website and on general discussion sites for unprompted feedback. Searching for mentions of our product and service on sites like Reddit and Twitter provided a window into conversations we were not directly a part of. We also had a dedicated email address for user support tickets, which generated a wealth of feedback as customers reached out about issues or requested new features.

Throughout the process, all feedback received, both quantitative and qualitative, was systematically logged, tagged, and prioritized by our product and design teams. The in-product usage metrics were the biggest driver of prioritization, but qualitative feedback helped validate hypotheses and shed new light on problems detected in analytics. After distilling learnings from all sources into actionable insights, we made several iterative updates within the first 3 months post-launch focused on improving core tasks, simplifying onboarding flows, and addressing common pain points.

Following these initial rounds of updates, we repeated the full feedback collection process to gauge how well changes addressed issues and to continue evolving the product based on a continuous feedback loop. User research became embedded in our core product development cycle, and we now have dedicated staff focused on ongoing feedback mechanisms and usability testing for all new features and experiments. While collecting feedback requires dedicated resources, it has proven invaluable for understanding user needs, identifying problems, building trust with customers, and delivering the best possible experience as our service continues to evolve.


To effectively gather and analyze usage metrics for your mobile app, there are a few key steps you need to take:

Integrate Analytics Software

The first step is to integrate an analytics software or SDK into your mobile app. Some top options for this include Google Analytics, Firebase Analytics, Amplitude, and Mixpanel. These platforms allow you to easily track custom events and user behavior without having to build the functionality from scratch.

When selecting an analytics platform, consider factors like cost, features offered, SDK ease of use, and data security/privacy. Most offer free tiers that would be suitable for early-stage apps. Integrating the SDK usually just requires adding a few lines of code to connect your app to the platform.

Track Basic Metrics

Once integrated, you’ll want to start by capturing some basic usage metrics. At a minimum, track metrics like active users, session counts, sessions per user, average session duration, and app installs. Tie these metrics to dates/times so you can analyze trends over time.

Also track device and OS information to understand where your users are coming from. Additional metrics like app opens, screen views, and location can provide further insights. The analytics platform may already capture some of these automatically, or you may need to add custom event tracking code.
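Most analytics platforms compute these numbers for you, but it can help to see what they mean. The sketch below derives daily active users, session counts, sessions per user, and average session duration from a raw session log. The tuple layout of the log is an assumption made for the example.

```python
from collections import defaultdict
from datetime import datetime

# Illustrative session log: (user_id, session_start, duration_seconds)
sessions = [
    ("u1", datetime(2024, 1, 1, 9, 0), 120),
    ("u1", datetime(2024, 1, 1, 18, 30), 60),
    ("u2", datetime(2024, 1, 1, 12, 0), 300),
    ("u1", datetime(2024, 1, 2, 9, 5), 90),
]

def daily_metrics(sessions):
    """Return per-day active users, sessions, sessions/user and avg duration."""
    users, durations = defaultdict(set), defaultdict(list)
    for user_id, start, seconds in sessions:
        day = start.date().isoformat()
        users[day].add(user_id)
        durations[day].append(seconds)
    return {
        day: {
            "active_users": len(users[day]),
            "sessions": len(durations[day]),
            "sessions_per_user": len(durations[day]) / len(users[day]),
            "avg_duration_s": sum(durations[day]) / len(durations[day]),
        }
        for day in sorted(users)
    }

metrics = daily_metrics(sessions)
```

Keying every metric by date is what makes trend analysis possible later.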

Track Custom Events

To understand user behavior and funnel metrics, you’ll need to track custom events for key actions and flows. Examples include buttons/links tapped, tours/onboarding flows completed, items purchased, levels/stages completed, account registrations, share actions, etc.

Assign meaningful event names and pass along relevant parameters like items viewed/purchased. This allows filtering and segmentation of your data. Tracking goals like conversions is also important for analyzing success of app changes and experiments.
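The idea of named events with parameters can be sketched as follows. This is an in-memory stand-in written for illustration, not any vendor's SDK; the class names, event names, and parameters are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Event:
    name: str
    user_id: str
    params: dict = field(default_factory=dict)
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

class EventLog:
    """Toy event store standing in for an analytics SDK's tracking call."""

    def __init__(self):
        self.events = []

    def track(self, name, user_id, **params):
        self.events.append(Event(name, user_id, params))

    def where(self, name, **param_filters):
        """Return events with this name whose params match every filter."""
        return [
            e for e in self.events
            if e.name == name
            and all(e.params.get(k) == v for k, v in param_filters.items())
        ]

log = EventLog()
log.track("item_purchased", "u1", item_id="sku-42", category="books", price=12.5)
log.track("item_purchased", "u2", item_id="sku-7", category="games", price=30.0)
log.track("onboarding_completed", "u2")

book_purchases = log.where("item_purchased", category="books")
```

Consistent event names and parameter keys are what make the filtering step work, so it is worth agreeing on a naming scheme before instrumenting the app.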

Integrate Crash Reporting

It’s critical to integrate crash reporting functionality as bugs and crashes directly impact the user experience and retention. Tools like Crashlytics and Sentry integrate seamlessly with popular analytics platforms to capture detailed crash logs and automatically tie them to user sessions.

This helps you quickly understand and fix crash causes to improve stability. Crash reports coupled with your usage data also illuminate crash-prone behaviors to avoid when designing new features.

Analyze the Data

With data pouring in, you’ll want to analyze the metrics and create custom reports/dashboards. Look at indicators like retention, engagement, funnel drops, crash rates, revenue/conversions over time. Filter data by cohort, country, device type and more using segmentation.

Correlate metrics to understand relationships. For example, do users who complete onboarding have higher retention? Analyze metric differences between releases to understand what’s working. Set goals and KPIs to benchmark success and inform future improvements.
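The onboarding question above can be answered with a simple group comparison. The sketch below computes a retention rate for users who did and did not complete onboarding. The field names and the day-30 retention definition are assumptions made for the example.

```python
def retention_by_group(users, flag="completed_onboarding",
                       outcome="retained_day_30"):
    """Return the retention rate for users with and without the flag."""
    counts = {True: [0, 0], False: [0, 0]}  # [retained, total] per group
    for u in users:
        group = counts[bool(u[flag])]
        group[0] += int(u[outcome])
        group[1] += 1
    return {k: (r / n if n else None) for k, (r, n) in counts.items()}

# Hypothetical user records
users = [
    {"completed_onboarding": True, "retained_day_30": True},
    {"completed_onboarding": True, "retained_day_30": True},
    {"completed_onboarding": True, "retained_day_30": False},
    {"completed_onboarding": False, "retained_day_30": False},
    {"completed_onboarding": False, "retained_day_30": True},
    {"completed_onboarding": False, "retained_day_30": False},
    {"completed_onboarding": False, "retained_day_30": False},
]
rates = retention_by_group(users)
```

A gap between the two rates shows an association, not a cause. Users who finish onboarding may simply have been more motivated to begin with, which is one reason to follow up with experiments.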

Periodically analyze usage qualitatively via user interviews, surveys and usability testing as well. Analytics only show what users do, not why – thus qualitative feedback is crucial for deeper understanding and ensuring your app meets real needs.

Make Data-Driven Decisions

With analysis complete, you’re ready to start making data-driven product decisions. Prioritize the improvements or features that analytics and user feedback point to for having the biggest impact.

Continuously use analytics to test hypotheses via A/B experiments, validate that changes achieve their goals, and iterate based on multichannel feedback loops. Gradually optimize key metrics until your retention, user satisfaction, and conversions are maximized based on evidence, not assumptions.
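One common way to check whether an A/B result is more than noise is a two-proportion z-test on conversion rates. The sketch below is a minimal version under the usual large-sample assumptions; the conversion counts are made up, and many analytics platforms provide their own significance calculations that may use different methods.

```python
from math import erf, sqrt

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Standard normal CDF expressed through the error function
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - cdf)

# Hypothetical experiment: control converts 200/2000, variant 250/2000
z, p_value = two_proportion_z_test(200, 2000, 250, 2000)
```

Decide on the sample size and significance threshold before the experiment starts, since repeatedly checking results and stopping early inflates the false positive rate.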

Continue Tracking Over Time

It’s important to continuously track usage data for the lifetime of your app through updates and growth. New releases and changes may impact metrics significantly – only ongoing tracking reveals these trends.

As your user base expands, drilling data down to specific cohorts becomes possible for more granular and actionable insights. Continuous insights also inform long term product strategies, marketing campaigns and monetization testing.

Comprehensive usage analytics are crucial for building a successful mobile app experience. With the right planning and integrations, leveraging data to understand user behavior and drive evidence-based decisions can significantly boost metrics like retention, engagement, satisfaction and ROI over the long run. Regular analysis and adaptation based on fresh data ensures your app always meets evolving user needs.


The first step is to gather customer data from your company’s CRM, billing, support and other operational systems. The key data points to collect include:

Customer profile information like age, gender, location, income etc. This will help identify demographic patterns in churn behavior.

Purchase and usage history over time. Features like number of purchases in last 6/12 months, monthly spend, most purchased categories/products etc. can indicate engagement level.

Payment and billing information. Features like number of late/missed payments, payment method, outstanding balance can correlate to churn risk.

Support and service interactions. Number of support tickets raised, responses received, issue resolution time etc. Poor support experience increases churn likelihood.

Marketing engagement data. Response to various marketing campaigns, email opens/clicks, website visits/actions etc. Disengaged customers are more prone to churning.

Contract terms and plan details. Features like contract length remaining, plan type (prepaid/postpaid), bundled services availed etc. Customers whose contracts are about to expire are at a natural decision point, so churn risk tends to rise as renewal approaches.

The data needs to be extracted from disparate systems, cleaned and consolidated into a single Customer Master File with all the attributes mapped to a single customer identifier. Data quality checks need to be performed to identify missing values, invalid values and outliers in the data.
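A minimal sketch of the consolidation and quality-check step is shown below, using plain dictionaries keyed on a customer identifier. The source systems, column names, and valid ranges are assumptions made for the example; a real pipeline would more likely use a database or a dataframe library.

```python
def build_master_file(*sources):
    """Merge records from several systems on customer_id."""
    master = {}
    for records in sources:
        for rec in records:
            master.setdefault(rec["customer_id"], {}).update(rec)
    return master

def quality_issues(master, required, valid_ranges):
    """List (customer_id, column, problem) for missing or out-of-range values."""
    issues = []
    for cid, row in sorted(master.items()):
        for col in required:
            if row.get(col) is None:
                issues.append((cid, col, "missing"))
        for col, (low, high) in valid_ranges.items():
            value = row.get(col)
            if value is not None and not low <= value <= high:
                issues.append((cid, col, "out_of_range"))
    return issues

# Hypothetical extracts from two systems
crm = [{"customer_id": 1, "age": 34}, {"customer_id": 2, "age": 212}]
billing = [
    {"customer_id": 1, "late_payments": 0},
    {"customer_id": 2, "late_payments": 3},
    {"customer_id": 3, "late_payments": 1},
]

master = build_master_file(crm, billing)
issues = quality_issues(
    master,
    required=["age", "late_payments"],
    valid_ranges={"age": (18, 110)},
)
```

Here customer 2 is flagged for an implausible age and customer 3 for having no CRM record. How to handle such rows (impute, correct, or drop) is a separate decision.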

The consolidated data needs to be analyzed to understand patterns, outliers, correlations between variables, and identify potential predictive features. Exploratory data analysis using statistical techniques like distributions, box plots, histograms, correlations will provide insights.
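As one example of the correlation checks mentioned above, the sketch below computes a Pearson correlation between a numeric feature and a 0/1 churn flag. The data is invented for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical feature and churn flag for six customers
late_payments = [0, 1, 0, 3, 4, 2]
churned = [0, 0, 0, 1, 1, 1]

r = pearson(late_payments, churned)
```

A coefficient near 1 or -1 marks a feature worth a closer look, while values near 0 only rule out a linear relationship, not a nonlinear one.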

Customer profiles need to be segmented using clustering algorithms like K-Means to group similar customer profiles. Association rule mining can uncover interesting patterns between attributes. These findings will help understand the target variable of churn better.

For modeling, the data needs to be split into train and test sets while maintaining class distributions. Features need to be selected based on domain knowledge, statistical significance, and correlations. Highly correlated features conveying similar information need to be removed to avoid multicollinearity issues.
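Maintaining class distributions means splitting each class separately so that the churn rate is the same in both sets. A minimal sketch, assuming rows are dictionaries with a churn label and an arbitrary 80/20 split:

```python
import random
from collections import defaultdict

def stratified_split(rows, label_key, test_fraction=0.2, seed=42):
    """Split rows so each class keeps the same share in train and test."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[label_key]].append(row)
    train, test = [], []
    for members in by_class.values():
        members = members[:]  # copy so the caller's data is not reordered
        rng.shuffle(members)
        n_test = round(len(members) * test_fraction)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test

# Hypothetical data: 100 customers, 20% of whom churned
rows = [{"id": i, "churned": int(i % 5 == 0)} for i in range(100)]
train, test = stratified_split(rows, "churned")
```

Without stratification, a random split of a dataset with few churners can leave the test set with too few positive cases to evaluate recall reliably.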

Various classification algorithms like logistic regression, decision trees, random forest, gradient boosting machines, and neural networks need to be trained on the training set. Their performance needs to be systematically compared on held-out validation data, using metrics like accuracy, precision, recall, and AUC-ROC, to identify the best model.
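The comparison metrics named above can be computed directly from true labels and predicted scores, as in this minimal sketch. The 0.5 decision threshold and the sample values are assumptions. AUC-ROC is computed here as the share of positive/negative pairs in which the positive case receives the higher score, which is simple but slow on large datasets.

```python
def classification_metrics(y_true, y_score, threshold=0.5):
    """Accuracy, precision, recall and AUC-ROC for binary labels (0/1)."""
    y_pred = [int(s >= threshold) for s in y_score]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    # AUC: probability a random positive outranks a random negative
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)

    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "auc_roc": wins / (len(pos) * len(neg)),
    }

# Hypothetical labels and model scores
y_true = [1, 0, 1, 0, 1, 0]
y_score = [0.9, 0.2, 0.4, 0.6, 0.7, 0.1]
metrics = classification_metrics(y_true, y_score)
```

Because churners are usually a small minority, accuracy alone can look good for a model that never predicts churn, so precision, recall and AUC-ROC deserve more weight.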

Hyperparameter tuning using grid search/random search is required to optimize model performance. Techniques like k-fold cross validation need to be employed to get unbiased performance estimates. The best model identified from this process needs to be evaluated on the hold-out test set.
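The mechanics of k-fold cross validation can be sketched as below: the data is shuffled and cut into k folds, and each fold takes a turn as the validation set. The `fit` and `score` callables are placeholders for whatever model and metric are being evaluated, and the majority-class baseline exists only to make the example runnable.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]

def cross_validate(fit, score, X, y, k=5):
    """Average validation score over k folds."""
    scores = []
    for train_idx, val_idx in k_fold_indices(len(X), k):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        scores.append(
            score(model, [X[i] for i in val_idx], [y[i] for i in val_idx])
        )
    return sum(scores) / len(scores)

# Hypothetical baseline: always predict the most common training label
def fit_majority(X, y):
    return max(set(y), key=y.count)

def accuracy(model, X, y):
    return sum(1 for label in y if label == model) / len(y)

X = list(range(20))
y = [0] * 15 + [1] * 5
baseline = cross_validate(fit_majority, accuracy, X, y, k=5)
```

A trivial baseline like this is a useful reference point: any real model should beat it by a clear margin. For imbalanced churn data, stratified folds are often preferred over the plain shuffle used here.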

The model output needs to be in the form of churn probability/score for each customer which can be mapped to churn risk labels like low, medium, high risk. These risk labels along with the feature importances and coefficients can provide actionable insights to product and marketing teams.
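Mapping probabilities to risk labels is a simple thresholding step. In the sketch below the cut-offs of 0.3 and 0.7 are arbitrary assumptions; in practice they would be chosen from the score distribution and the capacity of the retention team.

```python
def risk_label(probability, medium_at=0.3, high_at=0.7):
    """Map a churn probability to a low/medium/high risk label."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability >= high_at:
        return "high"
    if probability >= medium_at:
        return "medium"
    return "low"

# Hypothetical model output: churn probability per customer
scores = {"c1": 0.05, "c2": 0.45, "c3": 0.92}
labels = {cid: risk_label(p) for cid, p in scores.items()}
```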

Periodic model monitoring and re-training is required to continually improve predictions as more customer behavior data becomes available over time. New features can be added and insignificant features removed based on ongoing data analysis. Retraining ensures model performance does not deteriorate over time.

The predicted risk scores need to be fed back into marketing systems to design and target personalized retention campaigns at the right customers. Campaign effectiveness can be measured by tracking actual churn rates post campaign roll-out. This closes the loop to continually enhance model and campaign performance.

With responsible use of customer data, predictive modeling combined with targeted marketing and service interventions can significantly reduce customer churn, raising customer lifetime value and lowering the cost of acquiring new customers to replace those who leave. The insights from this data-driven approach enable companies to better understand customer needs, strengthen engagement and build long-term customer loyalty.