General Household Survey 2014

South Africa, 2014
Reference ID
ZAF_2014_GHS_v01_M_ILO
Producer(s)
Statistics South Africa
Collections
Other household surveys
Created on
Mar 30, 2017
Last modified
Mar 30, 2017
  • Identification
  • Version
  • Scope
  • Coverage
  • Producers and sponsors
  • Sampling
  • Data Collection
  • Questionnaires
  • Data Processing
  • Data Appraisal
  • Data access
  • Disclaimer and copyrights
  • Metadata production

Identification

Survey ID Number
ZAF_2014_GHS_v01_M_ILO
Title
General Household Survey 2014
Country
Name Country code
South Africa ZAF
Series Name
Other Household Survey [hh/oth]
Series Information
The GHS is an annual household survey conducted by Stats SA since 2002. The survey replaced the October Household Survey (OHS) which was introduced in 1993 and was terminated in 1999. The survey is an omnibus household based instrument aimed at determining the progress of development in the country. It measures, on a regular basis, the performance of programmes as well as the quality of service delivery in a number of key service sectors in the country. The GHS covers six broad areas, namely education, health and social development, housing, household access to services and facilities, food security, and agriculture.
Abstract
The GHS is an annual household survey specifically designed to measure the living circumstances of South African households. The GHS collects data on education, health and social development, housing, household access to services and facilities, food security, and agriculture. This report has three main objectives: firstly, it presents the key findings of GHS 2014; secondly, it provides trends across a thirteen-year period, i.e. since the GHS was introduced in 2002; and thirdly, it provides a more in-depth analysis of selected service delivery issues. As with previous reports, this report does not include tables with the specific indicators measured, as these will be included in a more comprehensive publication of development indicators, entitled Selected development indicators.
Kind of Data
Sample survey data [ssd]
Unit of Analysis
The units of analysis for the General Household Survey 2014 are individuals and households.

Version

Version number
Version 01

Scope

Study notes
The scope of the General Household Survey 2014 includes:

Household characteristics: Dwelling type, home ownership, access to water and sanitation, access to services, transport, household assets, land ownership, agricultural production
Individuals' characteristics: demographic characteristics, relationship to household head, marital status, language, education, employment, income, health, fertility, disability, access to social services, mortality.
Topic Classification
Topic Vocabulary
Agriculture & Rural Development ILO
Education ILO
Environment ILO
Health ILO
Health Insurance ILO
Household Income ILO
Transport ILO
Water ILO
Employment ILO
Unemployment ILO
Gender ILO
Disability ILO

Coverage

Geographic Coverage
The General Household Survey 2014 had national coverage. The lowest level of geographic aggregation for this dataset is province.
Geographic Unit
The lowest level of geographic aggregation covered by the General Household Survey 2014 is the province.
Universe
The survey covers all de jure household members (usual residents) of households in the nine provinces of South Africa and residents in workers' hostels. The survey does not cover collective living quarters such as student hostels, old age homes, hospitals, prisons and military barracks.

Producers and sponsors

Authoring entity/Primary investigators
Agency Name Affiliation
Statistics South Africa South African Government

Sampling

Sampling Procedure
A multi-stage design was used in this survey, based on a stratified design with probability-proportional-to-size selection of primary sampling units (PSUs) at the first stage and systematic sampling of dwelling units (DUs) at the second stage. After the sample had been allocated to the provinces, it was further stratified by geography (primary stratification) and by population attributes using Census 2001 data (secondary stratification).

Survey officers employed and trained by Stats SA visited all the sampled dwelling units in each of the nine provinces. During the first phase of the survey, sampled dwelling units were visited and informed about the coming survey as part of the publicity campaign. The actual interviews took place four weeks later. A total of 25 363 households (including multiple households) were successfully interviewed face to face.

A total of 233 enumerators and 62 provincial and district coordinators participated in the survey across all nine provinces. An additional 27 quality assurors were responsible for monitoring and ensuring questionnaire quality. National training took place over a period of four days. The national trainers then trained provincial trainers for five days at provincial level. They in turn provided district training to the survey officers for a period of six days.
Deviations from the Sample Design
The sample design for the GHS 2014 was based on a master sample (MS) that was originally designed for the Quarterly Labour Force Survey (QLFS) and was used for the first time for the GHS in 2008. This master sample is shared by the QLFS, GHS, Living Conditions Survey (LCS), Domestic Tourism Survey (DTS) and the Income and Expenditure Survey (IES).

The master sample used a two-stage, stratified design with probability-proportional-to-size (PPS) sampling of primary sampling units (PSUs) from within strata, and systematic sampling of dwelling units (DUs) from the sampled PSUs. A self-weighting design at provincial level was used and MS stratification was divided into two levels. Primary stratification was defined by metropolitan and non-metropolitan geographic area type. During secondary stratification, the Census 2001 data were summarised at PSU level. The following variables were used for secondary stratification: household size, education, occupancy status, gender, industry and income.

Census enumeration areas (EAs) as delineated for Census 2001 formed the basis of the PSUs. The following additional rules were used:
• Where possible, PSU sizes were kept between 100 and 500 DUs;
• EAs with fewer than 25 DUs were excluded;
• EAs with between 26 and 99 DUs were pooled to form larger PSUs, the criterion being that pooled EAs share the same settlement type;
• Virtual splits were applied to large PSUs: 500 to 999 split into two; 1 000 to 1 499 split into three; and 1 500 plus split into four PSUs; and
• Informal PSUs were segmented.
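
The size rules above can be read as a simple decision on the number of dwelling units in an EA. The following is a minimal sketch only, not Stats SA's procedure: the function name and return labels are hypothetical, an EA with exactly 25 DUs is assumed to be pooled (the rules name only "fewer than 25" and "between 26 and 99"), and an EA with exactly 500 DUs is assumed to fall under the virtual-split rule.

```python
from typing import Tuple


def classify_ea(dwelling_units: int) -> Tuple[str, int]:
    """Return (action, number_of_psus) for one enumeration area (EA).

    Hypothetical helper illustrating the PSU size rules; labels are made up.
    """
    if dwelling_units < 0:
        raise ValueError("dwelling_units must be non-negative")
    if dwelling_units < 25:
        return ("exclude", 0)   # too small to be used at all
    if dwelling_units < 100:
        # Assumption: exactly 25 DUs is treated as pooled, since the text
        # leaves that boundary case undefined.
        return ("pool", 0)      # merged with EAs of the same settlement type
    if dwelling_units < 500:
        return ("keep", 1)      # already within the target PSU size
    if dwelling_units < 1000:
        return ("split", 2)     # virtual split into two PSUs
    if dwelling_units < 1500:
        return ("split", 3)     # virtual split into three PSUs
    return ("split", 4)         # 1 500 or more: four PSUs
```

Pooling itself (choosing which small EAs to merge) and the segmentation of informal PSUs are not covered by this sketch.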

A randomised-probability-proportional-to-size (RPPS) systematic sample of PSUs was drawn in each stratum, with the measure of size being the number of households in the PSU. Altogether approximately 3 080 PSUs were selected. In each selected PSU a systematic sample of dwelling units was drawn. The number of DUs selected per PSU varies from PSU to PSU and depends on the Inverse Sampling Ratios (ISR) of each PSU.
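
To illustrate the idea of selecting PSUs with probability proportional to size, the sketch below implements plain systematic PPS selection with a random start. It is an assumption-laden illustration: the exact randomisation used in the RPPS method is not described here and is not reproduced, and the function and parameter names are hypothetical.

```python
import random
from typing import List, Sequence


def pps_systematic_sample(sizes: Sequence[int], n: int,
                          rng: random.Random) -> List[int]:
    """Select n unit indices with probability proportional to size.

    sizes: measure of size per PSU (e.g. number of households).
    A unit larger than the sampling interval can be selected more than once.
    """
    total = sum(sizes)
    if n <= 0 or total <= 0:
        raise ValueError("need a positive sample size and total size")
    interval = total / n
    start = rng.random() * interval   # random start in [0, interval)
    selected = []
    cumulative = 0.0                  # total size of units already passed
    idx = 0
    for k in range(n):
        point = start + k * interval  # k-th selection point on the size scale
        # Advance to the unit whose cumulative size range contains the point.
        while idx < len(sizes) - 1 and cumulative + sizes[idx] <= point:
            cumulative += sizes[idx]
            idx += 1
        selected.append(idx)
    return selected
```

For example, `pps_systematic_sample([120, 300, 80, 500], 2, random.Random(0))` returns two indices in ascending order, with larger PSUs more likely to appear.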
Response Rate
Response rates per province, 2014

Province         Per cent
Western Cape     93,1
Eastern Cape     96,9
Northern Cape    96,3
Free State       96,6
KwaZulu-Natal    96,3
North West       96,9
Gauteng          81,8
Mpumalanga       96,6
Limpopo          99,3
South Africa     93,7
The national response rate for the survey was 93,7%. The highest response rate (99,3%) was recorded in Limpopo and the lowest in Gauteng (81,8%).
Weighting
The sampling weights for the data collected from the sampled households were constructed so that the responses could be properly expanded to represent the entire civilian population of South Africa. The design weights, which are the inverse sampling rate (ISR) for the province, are assigned to each of the households in a province. These were adjusted for four factors: Informal PSUs, Growth PSUs, Sample Stabilisation, and Non-responding Units.

Mid-year population estimates produced by the Demographic Analysis division were used for benchmarking. The final survey weights were constructed using regression estimation to calibrate to national level population estimates cross-classified by 5-year age groups, gender and race, and provincial population estimates by broad age groups. The 5-year age groups are: 0–4, 5–9, 10–14, 55–59, 60–64; and 65 and older. The provincial level age groups are 0–14, 15–34, 35–64; and 65 years and older. The calibrated weights were constructed in such a way that all persons in a household would have the same final weight.

The Statistics Canada software StatMx was used for constructing calibration weights. The population controls at national and provincial levels were used for the cells defined by cross-classification of Age by Gender by Race. Records with item non-response for age, population group or sex could not be weighted and were therefore excluded from the dataset. No imputation was done to retain these records.
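
The weighting steps above can be illustrated with a deliberately simplified sketch. It is not the StatMx regression calibration: it swaps in plain post-stratification, a simpler special case, and it does not enforce equal weights within a household. The form of the non-response adjustment (a ratio of sampled to responding units) is an assumption, and all function names, field names and cell labels are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def nonresponse_adjust(isr: float, sampled: int, responded: int) -> float:
    """Inflate a design weight (the ISR) for non-responding units.

    Assumption: a simple ratio adjustment within an adjustment class.
    """
    if responded <= 0:
        raise ValueError("at least one responding unit is required")
    return isr * sampled / responded


def post_stratify(records: List[dict],
                  controls: Dict[str, float]) -> List[float]:
    """Scale weights so weighted totals per cell match population controls.

    Each record needs a 'cell' key (e.g. an age-by-gender-by-race label)
    and a 'weight' key holding the adjusted design weight.
    """
    cell_totals: Dict[str, float] = defaultdict(float)
    for rec in records:
        cell_totals[rec["cell"]] += rec["weight"]
    return [
        rec["weight"] * controls[rec["cell"]] / cell_totals[rec["cell"]]
        for rec in records
    ]
```

After `post_stratify`, the weights in each cell sum to that cell's control total, which is the benchmarking property that the calibration to mid-year population estimates is meant to achieve.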

Data Collection

Dates of Data Collection (YYYY/MM/DD)
Start date End date
2014 2014
Time periods (YYYY/MM/DD)
Start date End date
2014-01 2014-12
Mode of data collection
Face-to-face [f2f]
Supervision
A total of 233 enumerators and 62 provincial and district coordinators participated in the survey across all nine provinces. An additional 27 quality assurors were responsible for monitoring and ensuring questionnaire quality. National training took place over a period of four days. The national trainers then trained provincial trainers for five days at provincial level. They in turn provided district training to the survey officers for a period of six days.
Characteristics of Data Collection Situation - Notes on data collection
In addition to changes to the questions, the data collection period has also changed since 2002. Between 2002 and 2008 data were gathered during July. The data collection period was extended to 3 months (July to September) between 2010 and 2012. As from 2013, the data collection period was extended to 12 months (January to December). Although the extension is not necessarily a limitation, it should be borne in mind when using the data for comparative purposes.

Questionnaires

Type of Research Instrument
The details of the questions included in the GHS questionnaire are covered in 10 sections, each focusing on a particular aspect. Depending on the need for additional information, the questionnaire is adapted on an annual basis. New sections may be introduced on a specific topic for which information is needed or additional questions may be added to existing sections. Likewise, questions that are no longer necessary may be removed.

A summary of the contents of the GHS 2014 questionnaire

Section       Number of questions   Details of each section
Cover page                          Household information, response details, field staff information, result codes, etc.
Flap          6                     Demographic information (name, sex, age, population group, etc.)
Section 1     41                    Biographical information (education, health, disability, welfare)
Section 2     13                    Health and general functioning
Section 3     3                     Social grants and social relief
Section 4     19                    Economic activities
Section 5     59                    Household information (type of dwelling, ownership of dwelling, electricity, water and sanitation, environmental issues, services, transport, etc.)
Section 6     11                    Communication, postal services and transport
Section 7     15                    Health, welfare and food security
Section 8     28                    Household livelihoods (agriculture, household income sources and expenditure)
Section 9     7                     Mortality in the last 12 months
Section 10    3                     Questions to interviewers
All sections  202                   Comprehensive coverage of living conditions and service delivery

The GHS questionnaire has undergone some revisions over time. These changes were primarily the result of shifts in the focus of government programmes. The 2002–2004 questionnaires were very similar. Changes made to the GHS 2005 questionnaire included additional questions in the education section, for a total of 179 questions. Between 2006 and 2008, the questionnaire remained virtually unchanged. For GHS 2009, extensive stakeholder consultation took place during which the questionnaire was reviewed to be more in line with the monitoring and evaluation frameworks of the various government departments. The sections modified substantially during the review were those on education, social development, housing, agriculture, and food security. Even though the number of sections and pages in the questionnaire remained the same, the number of questions increased from 166 in the 2006–2008 questionnaires to 185 in GHS 2009. Following the introduction of a dedicated survey on Domestic Tourism, the section on tourism was dropped for GHS 2010. Due to a further rotation of questions, particularly the addition of a module on mortality, the GHS 2014 questionnaire contained 202 questions.

Data Processing

Cleaning Operations
Historically the GHS used a conservative and hands-off approach to editing. Editing was manual, and little if any imputation was done. The focus of the editing process was on clearing skip violations and ensuring that each variable only contains valid values. Very few limits to valid values were set and data were largely released as they were received from the field. With GHS 2009, Stats SA introduced an automated editing and imputation system that was continued for GHSs 2010–2014. The challenge was to remain true, as much as possible, to the conservative approach used prior to GHS 2009, and yet, at the same time, to develop a standard set of rules to be used during editing which could be applied consistently across time. When testing for skip violations and doing automated editing, the following general rules are applied in cases where one question follows the filter question and the skip is violated:

• If the filter question had a missing value, the filter is allocated the value that corresponds with the subsequent question which had a valid value.
• If the values of the filter question and subsequent question are inconsistent, the filter question’s value is set to missing and imputed using either the hot-deck or nearest neighbour imputation techniques. The imputed value is then once again tested against the skip rule. If the skip rule remains violated, the question subsequent to the filter question is dealt with by either setting it to missing and imputing or, if that fails, printing a message of edit failure for further investigation, decision-making and manual editing.
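
The two rules for a single follow-up question can be sketched as follows. This is an illustration under stated assumptions, not the published editing code: the filter is modelled as a hypothetical yes/no question where only "yes" leads to the follow-up, the imputation step is passed in as a caller-supplied function, and the status labels are made up.

```python
from typing import Callable, Optional, Tuple


def edit_single_skip(
    filter_value: Optional[str],
    follow_up: Optional[str],
    impute_filter: Callable[[], str],
) -> Tuple[Optional[str], Optional[str], str]:
    """Return (filter_value, follow_up, status) after applying the rules.

    None stands for a missing value. impute_filter is a stand-in for the
    hot-deck or nearest neighbour imputation of the filter question.
    """
    if filter_value is None and follow_up is not None:
        # Rule 1: a missing filter takes the value implied by the follow-up.
        return ("yes", follow_up, "filter_filled")
    if filter_value == "no" and follow_up is not None:
        # Rule 2: inconsistent pair; blank the filter, impute it, re-test.
        imputed = impute_filter()
        if imputed == "yes":
            return (imputed, follow_up, "filter_imputed")
        # Skip rule still violated: the follow-up is set to missing. The
        # text then imputes it or reports an edit failure for manual review.
        return (imputed, None, "follow_up_blanked")
    return (filter_value, follow_up, "unchanged")
```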

In cases where skip violations take place for questions where multiple questions follow the filter question, the rules used are as follows:
• If the filter question has a missing value, the filter is allocated the value that corresponds with the value expected given the completion of the remainder of the question set.
• If the values of the filter question and the subsequent questions are inconsistent, a counter is set to determine what proportion of the subsequent questions have been completed. If more than 50% of the subsequent questions have been completed, the filter question’s value is modified to correspond with the fact that the rest of the questions in the set were completed. If less than 50% of the subsequent questions in the set were completed, the value of the filter question is set to missing and imputed using either the hot-deck or nearest neighbour imputation techniques. The imputed value is then once again tested against the skip rule. If the skip rule remains violated, the questions in the set that follow the filter question are set to missing.
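
The rules for a filter followed by several questions hinge on the share of follow-up questions that were completed. The sketch below makes that threshold logic explicit. It is illustrative only: the yes/no filter coding and the function names are hypothetical, imputation is a caller-supplied stand-in, and a share of exactly 50% (which the rules do not address) is assumed to follow the "less than 50%" branch.

```python
from typing import Callable, List, Optional, Tuple


def edit_multi_skip(
    filter_value: Optional[str],
    follow_ups: List[Optional[str]],
    impute_filter: Callable[[], str],
) -> Tuple[Optional[str], List[Optional[str]]]:
    """Apply the multi-question skip rules; None stands for missing."""
    if not follow_ups:
        raise ValueError("follow_ups must not be empty")
    answered = sum(1 for value in follow_ups if value is not None)
    share = answered / len(follow_ups)

    if filter_value is None and answered > 0:
        # Missing filter: take the value implied by the completed set.
        return ("yes", list(follow_ups))
    if filter_value == "no" and answered > 0:
        if share > 0.5:
            # Most of the set was completed: trust the follow-ups.
            return ("yes", list(follow_ups))
        # Assumption: exactly 50% is handled like the under-50% case.
        imputed = impute_filter()
        if imputed == "yes":
            return (imputed, list(follow_ups))
        # Skip rule still violated: blank the whole follow-up set.
        return (imputed, [None] * len(follow_ups))
    return (filter_value, list(follow_ups))
```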

When dealing with internal inconsistencies, as much as possible was done using logical imputation, i.e. information from other questions is compared with the inconsistent information. If other evidence is found to back up either of the two inconsistent viewpoints, the inconsistency is resolved accordingly. If the internal inconsistency remains, the question subsequent to the filter question is dealt with by either setting it to missing and imputing its value or printing a message of edit failure for further investigation, decision-making and manual editing.

Two imputation techniques were used for imputing missing values: hot deck and nearest neighbour. In both cases the already published code was used for imputation. The variable composition of hot decks is based on a combination of the variables used for the Census (where appropriate), an analysis of odds ratios and logistic regression models. Generally, as in the QLFS system, the GHS adds geographic variables such as province, geography type, metro/non-metro, population group, etc. to further refine the decks. This was not done for Census 2001, and it is assumed that the reason is the difference in deck size and position between sample surveys and a multi-million-record database.
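
For readers unfamiliar with hot-deck imputation, the following is a minimal sketch of a sequential hot deck. It is not the published code referred to above: the function name, field names and choice of class variables are hypothetical. Each missing value is replaced by the most recently seen valid value from a record in the same imputation class (the "deck").

```python
from typing import Dict, List, Sequence, Tuple


def hot_deck_impute(records: List[dict], target: str,
                    class_vars: Sequence[str]) -> List[dict]:
    """Return copies of records with missing target values filled in.

    class_vars define the decks (e.g. province, population group).
    A missing value (None) stays missing if no donor has been seen yet.
    """
    deck: Dict[Tuple, object] = {}   # last valid value seen per class
    result = []
    for rec in records:
        key = tuple(rec[var] for var in class_vars)
        new = dict(rec)              # do not modify the input record
        if new[target] is None:
            if key in deck:
                new[target] = deck[key]   # borrow the donor's value
        else:
            deck[key] = new[target]       # this record becomes the donor
        result.append(new)
    return result
```

Adding more class variables makes donors more similar to recipients but shrinks each deck, which is the trade-off in deck size between a sample survey and a census noted above.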

Data Appraisal

Data Appraisal
Please note that DataFirst provides versioning at dataset and file level. Revised files have new version numbers. Files that are not revised retain their original version numbers. Any changes to files will result in the dataset having a new version number. Thus version numbers of files within a dataset may not match.

Data access

Contact
Name Affiliation Email URI
Statistics South Africa South African Government [email protected] Link
Conditions
Public use data, available to all.
Citation requirement
Statistics South Africa. General Household Survey 2014 [dataset]. Version 1. Pretoria. Statistics South Africa [producer], 2015. Cape Town. DataFirst [distributor], 2015.

Disclaimer and copyrights

Disclaimer
The use of any data is subject to acknowledgement of Stats SA as the supplier and owner of copyright. Statistics South Africa (Stats SA) will not be liable for any damages or losses, except to the extent that such losses or damages are attributable to a breach by Stats SA of its obligations in terms of an existing agreement or to the negligence or wilful act or omission of Stats SA, its servants or agents, arising out of the supply of data and/or digital products in terms of that agreement. The user indemnifies Stats SA against any claims of whatsoever nature (including legal costs) by third parties arising from the reformatting, restructuring, reprocessing and/or addition of the data by the user.
Copyright
Copyright 2014, Statistics South Africa

Metadata production

Document ID
DDI_ZAF_2014_GHS_v01_M_ILO
Producers
Name Abbreviation Affiliation Role
Department of Statistics ILO International Labour Organization Producer of DDI
Date of Production
2017-03-30

© 1996 - 2026 International Labour Organization (ILO) | Copyright and permissions