Lessons learnt from a multi-country survey
Here at Stats4SD, we run various types of studies, including some that span multiple countries and sectors. Each comes with its own triumphs and challenges, and as part of our ongoing evaluation and learning, Alex has decided to turn his latest findings into this month’s blog feature! The project he worked on involved four official languages, three alphabets and two forms of data collection, and it has therefore created lots of interesting learning opportunities. Alex talks us through the key ones in what has been a very exciting project for us!
2017 was the final year of a five-year project that aimed to improve the health of the population in targeted areas of four Central Asian countries, particularly women of reproductive age, newborns and children under five. The intervention areas were mainly mountainous and usually physically isolated. As a result, communities in these areas were often the poorest within their countries; they were marginalised from national, social, economic and political developments, and had experienced perpetual under-investment in the health sector.
Stats4SD was the global consultant for the 2017 end-line survey. Our task was to design and coordinate a robust survey that could demonstrate, where possible, change in the core indicators relative to the 2014 baseline survey. We were in charge of:
- The final version of the questionnaires;
- Presenting workshops on questionnaire and survey design;
- Data entry;
- Data tabulation;
- Training enumeration coordinators in the use of the survey tools and data collection methodology;
- Additional assistance consisting of advice on sample size determination, sampling design and implementation, and data cleaning.
The survey was carried out across multiple countries, languages and alphabets, and with two different data collection methods (paper questionnaires and mobile devices). Designing and coordinating it posed a number of interesting challenges, and many lessons were learnt along the way. Below are the most relevant ones.
Since it is expensive to conduct a survey, there is a tendency to add as many modules and questions as possible because they may be useful, either for the current objectives or for future purposes, even though some of them may never be analysed. As a result, questionnaires tend to be too long, which compromises the quality of the data collected. Respondents become tired during long interviews (of an hour or more) and stop listening and responding to the questions carefully. The obvious lesson learnt is that questionnaires should include all the questions relevant to the study, but only these. The not so obvious lesson is that questionnaires should be designed only once the tabulation plan (the document with all the model tables and results to be included in the report) is complete. Moreover, it is advisable to draft the report before the questionnaire is considered final.
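One way to put the tabulation plan to work is to check the questionnaire against it. The sketch below is only an illustration: the variable names, the table titles and the function `compare_questionnaire_to_plan` are all made up for this example and do not come from the project. It lists the questions that no planned table uses (candidates to drop) and the variables that a planned table needs but the questionnaire does not collect.

```python
def compare_questionnaire_to_plan(questionnaire_vars, tabulation_plan):
    """Compare the questions asked with the variables the planned tables need.

    questionnaire_vars: list of variable names collected by the questionnaire.
    tabulation_plan: dict mapping a table title to the variables it uses.
    Returns (unused, missing) as two sorted lists.
    """
    needed = set()
    for table_vars in tabulation_plan.values():
        needed.update(table_vars)
    asked = set(questionnaire_vars)
    unused = sorted(asked - needed)    # asked, but never tabulated
    missing = sorted(needed - asked)   # tabulated, but never asked
    return unused, missing


# Hypothetical questionnaire and tabulation plan, for illustration only.
questionnaire = ["age", "sex", "antenatal_visits", "favourite_radio_station"]
plan = {
    "Table 1: Antenatal care by age group": ["age", "antenatal_visits"],
    "Table 2: Birth attendance by sex of child": ["sex", "birth_attendant"],
}

unused, missing = compare_questionnaire_to_plan(questionnaire, plan)
print("Candidates to drop:", unused)
print("Missing from questionnaire:", missing)
```

In practice the plan would probably live in a spreadsheet rather than in code, but the check is the same: every question should feed at least one table, and every table should be fed by the questions.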
A second step is needed to further refine the questionnaire: piloting. Piloting improves the wording of the questions, the translations and the response options. It is also useful for planning, as it provides an estimate of the time needed to complete an interview. Piloting is an exercise to test the tool, and should not be confused with the enumerators’ field exercise.
Finally, back-translation helps to ensure that nothing is lost in translation.
Using the same questionnaire for four different countries poses a series of challenges, as different teams may have different expectations or needs, making it difficult to agree on a final set of questions. A fine balance between the country specificities and the need for standardisation is required.
Data Collection using mobile devices
The advantages and disadvantages of using hand-held devices to collect data are well known. I would like, however, to illustrate one feature that shows how powerful they can be. Mobile phones and tablets have GPS location, and this function can be used to map the progress of the survey. Such maps are not only visually appealing, but also provide useful information for the remote monitoring of the survey.
Figure 1: Map of one of the program intervention areas. The coloured areas indicate the sampled villages.
Figure 2: Map of one of the sampled villages. The dots indicate the sampled households.
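To give a flavour of how GPS points can support remote monitoring, here is a minimal sketch. It is not the tool we used in the project: the coordinates, household identifiers, targets and the 2 km threshold are invented for illustration. It counts completed interviews per village against a target and flags any interview recorded far from the centre of its village, using the standard haversine distance.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


def monitor(interviews, villages, max_km=2.0):
    """Return (progress, flagged).

    progress: village -> (interviews completed, target)
    flagged: ids of interviews recorded more than max_km from the village centre
    """
    progress = {name: (0, v["target"]) for name, v in villages.items()}
    flagged = []
    for it in interviews:
        v = villages[it["village"]]
        done, target = progress[it["village"]]
        progress[it["village"]] = (done + 1, target)
        if haversine_km(it["lat"], it["lon"], v["lat"], v["lon"]) > max_km:
            flagged.append(it["id"])
    return progress, flagged


# Invented data, for illustration only.
villages = {"Village A": {"lat": 38.50, "lon": 70.00, "target": 10}}
interviews = [
    {"id": "hh-01", "village": "Village A", "lat": 38.501, "lon": 70.001},
    {"id": "hh-02", "village": "Village A", "lat": 38.499, "lon": 70.002},
    {"id": "hh-03", "village": "Village A", "lat": 38.60, "lon": 70.00},
]

progress, flagged = monitor(interviews, villages)
print(progress)
print("Check these interviews:", flagged)
```

A flagged interview is not necessarily a problem (the GPS fix may simply be poor in a mountain valley), but it tells the coordinator where to look first.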
Finally, we used mobile devices to obtain the household listings as well. Being able to see the households listed and selected for interviewing was of great help to the enumerator teams.
Very soon into the process, we became aware that one of the areas where data had been collected at baseline had received no intervention at all. The area accounted for roughly one-fifth of the data in the country. At first, we decided to re-run the baseline analysis after removing all the households in that area, but we soon realised that the baseline analysis was a black box: we could replicate the full analysis, but we could not customise it to meet the new information needs. We could, of course, have re-programmed the whole analysis, but this would have been lengthy and costly. The lesson learnt here is the importance of storing and securing the original raw data sets and the computations associated with them, as well as documenting them properly.
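To illustrate what a less opaque analysis might look like, here is a small sketch in which the indicator is always computed from the raw records and the areas to exclude are a parameter. It is an assumption about how such an analysis could be organised, not a description of the baseline code; the records, area names and indicator name are made up.

```python
def indicator_share(records, indicator, exclude_areas=()):
    """Share of households where `indicator` is true, computed from raw records.

    exclude_areas: areas to drop before computing, e.g. areas that turned
    out to have received no intervention.
    """
    kept = [r for r in records if r["area"] not in exclude_areas]
    if not kept:
        raise ValueError("no records left after exclusion")
    return sum(1 for r in kept if r[indicator]) / len(kept)


# Made-up raw data: five households, one of them in the area to be removed.
raw_records = [
    {"area": "A", "skilled_birth_attendance": True},
    {"area": "B", "skilled_birth_attendance": True},
    {"area": "C", "skilled_birth_attendance": False},
    {"area": "D", "skilled_birth_attendance": True},
    {"area": "E", "skilled_birth_attendance": False},
]

full = indicator_share(raw_records, "skilled_birth_attendance")
restricted = indicator_share(
    raw_records, "skilled_birth_attendance", exclude_areas=("E",)
)
print(full, restricted)
```

Because the raw data and the computation are both kept, re-running the baseline without one area becomes a one-line change rather than a re-programming job.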
Wow, what a lot of insightful lessons! Have you worked on a multi-country study with similar lessons? Or have yours been very different? Was this due to the nature of the study or the countries surveyed? Whatever your story, we’d be interested in hearing from you about your experience. Stay tuned and join us next month, when we’ll be looking at another study and sharing our insights again. Until then!
Author: Alex Riba
Alex is an engineer with over 20 years of experience teaching statistics and conducting research. Having worked as a statistician on a wide range of projects, he is particularly interested in processes that let data speak for itself, especially in meaningful ways for non-statisticians.