How to collect data

While I was hanging out with my friends at a coffee shop, one of them asked if I could analyze his business. Some of the goals of the data analysis he requested were to evaluate the weaknesses of his business and determine whether he could optimize sales of his products. I replied, “Sure, it’s possible if the data is ready—do you have the data or not?” and he answered that he had never kept any records. I explained that he needs to have the data available before conducting an analysis; otherwise, he must collect it first.

Sometimes situations arise where our clients don’t have any data to analyze. So how do you analyze data if there isn’t anything to analyze? In general, if a company has never conducted an analysis before, the first step is to hire a data analyst to collect the data; once there is enough data, the analysis can begin. In this article, I’ll explain how to obtain data for analysis purposes.

Determine the purpose of your analysis

Before you begin the data collection process, you first need to determine the purpose of the analysis you will be conducting. You can start by writing down the objectives of your research and explaining why it is important to conduct this research.

Next, once you know the purpose of the analysis you’re going to do, decide what kind of data will support your analysis. Based on their nature/characteristics, data can be grouped into two categories:

  • Quantitative data: Data that uses numerical values and that we can measure. Researchers typically analyze this data using statistical methods.
  • Qualitative data: Data characterized by categories, labels, and descriptions. In general, we analyze this data by grouping and interpreting it.
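As a rough illustration of the difference, here is a minimal Python sketch that summarizes a numeric field with basic statistics and a categorical field by counting labels. The records, field names (`units_sold`, `feedback`), and helper functions are all invented for this example; real data would come from your own collection process.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical records for a small shop; every value is invented for illustration.
sales = [
    {"product": "latte", "units_sold": 12, "feedback": "positive"},
    {"product": "espresso", "units_sold": 8, "feedback": "neutral"},
    {"product": "mocha", "units_sold": 15, "feedback": "positive"},
    {"product": "tea", "units_sold": 5, "feedback": "negative"},
]


def summarize_quantitative(values):
    """Describe numeric data with a few basic statistics."""
    return {
        "mean": mean(values),
        "median": median(values),
        "min": min(values),
        "max": max(values),
    }


def summarize_qualitative(labels):
    """Describe categorical data by counting how often each label occurs."""
    return dict(Counter(labels))


# Numeric field: measured and summarized statistically.
units_summary = summarize_quantitative([row["units_sold"] for row in sales])
# Categorical field: grouped by label and counted.
feedback_summary = summarize_qualitative([row["feedback"] for row in sales])

print(units_summary)
print(feedback_summary)
```

The numeric column goes through statistical functions, while the label column is only grouped and counted; interpreting what those groups mean is still up to the analyst.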

What is data collection?

Data collection is a systematic process for gathering, measuring, and analyzing information from various sources in order to gain a deep understanding of a specific issue. It is crucial because, with sufficient data, we can answer the questions we set out to address.

Data collection methods

Data collection methods include various techniques and tools for gathering quantitative and qualitative data. Based on the source of the data, there are basically two types of data: primary data and secondary data.

Primary data

Primary data is information that researchers collect directly from the source in accordance with the research objectives. It is up-to-date, relevant, and highly suited to the needs of your research. Researchers should collect primary data when they want data that is highly relevant to a specific research topic, as it has the following characteristics:

  • Accurate
  • Suitable for research needs
  • Researchers can control the data quality

Primary data collection methods

There are several methods for collecting primary data; the following are some of the most commonly used methods.

  • Interview: An interview is a two-way interaction between an interviewer and a respondent aimed at gathering information. The advantage of this method is that it allows us to gain information directly from the source; it is suitable for research focusing on behavioral studies.
  • Questionnaire: A questionnaire is a data collection method that involves presenting pre-prepared questions to respondents. The advantage of this method is that it is more effective for collecting data on a large scale. This method is suitable for conducting market research.
  • Experiment: An experiment is a structured and controlled procedure that is conducted to test a hypothesis. The advantage of this method is that it can produce highly precise results. Researchers commonly use this method in drug trials and in evaluating learning methods.
  • Observation: Observation is the process of collecting data through direct observation. This method has the benefit of providing real-time data. It is commonly used in traffic density analysis and field research.

Secondary data

The next type of data is secondary data, which is data that has already been collected by others and is ready for analysis. Secondary data is ideal for those learning to become data analysts, as it saves time and money, and the available datasets are generally comprehensive and large. Here are some common sources of secondary data used for analysis.

  • Government publications: Government agencies typically publish the data they collect; for example, if you want to access poverty data, you can look it up on the BPS (Statistics Indonesia) website.
  • Journals and papers: Some journals provide a variety of statistical data in their articles, which researchers use to support their research.
  • Online websites: Generally, the data used for learning AI/ML comes from websites that provide data; the most commonly used website for finding datasets is Kaggle.
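Once you have a secondary dataset, it usually helps to take a quick look at it before analyzing it. The sketch below is one possible way to do that in Python, assuming the dataset is a CSV file. To keep the example self-contained, it reads from an in-memory string instead of a real download; the column names and values are invented, and `load_rows` and `count_missing` are hypothetical helpers, not part of any library.

```python
import csv
import io

# Stand-in for a downloaded CSV file; the columns and values are invented.
SAMPLE_CSV = """region,year,rate
North,2022,9.5
South,2022,
East,2022,7.1
"""


def load_rows(file_obj):
    """Read a CSV file object into a list of dictionaries, one per row."""
    return list(csv.DictReader(file_obj))


def count_missing(rows):
    """Count empty or absent values per column."""
    missing = {}
    for row in rows:
        for column, value in row.items():
            if value is None or value.strip() == "":
                missing[column] = missing.get(column, 0) + 1
    return missing


rows = load_rows(io.StringIO(SAMPLE_CSV))
missing = count_missing(rows)

print(len(rows), "rows; columns:", list(rows[0].keys()))
print("missing values per column:", missing)
```

For a real file, you would pass `open("your_file.csv", newline="")` to `load_rows` instead of the `io.StringIO` object; the file name here is only a placeholder.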
