*** Do-file as part of Communication Methods and Measures example
*Date: July 17, 2017
*Appendix to: De Vreese, C. H., Boukes, M., Schuck, A. R. T., Vliegenthart, R., Bos, L., & Lelkes, Y. (forthcoming). Linking Survey and Media Content Data: Opportunities, Considerations and Pitfalls. Communication Methods and Measures. doi:10.1080/19312458.2017.1380175

*** open dataset:
clear
use "C:\Vidi\Linking\ContentAnalysis_Appendix.dta"
drop Length
drop if outlet == 4
drop if outlet == 7
drop if outlet == 10

*** Step 1 – Generating a variable indicating the wave:
* Wave 1 commenced on day 54 of the year 2015; wave 2 on day 110; and wave 3 on day 166
gen PublishedBeforeWave = .
replace PublishedBeforeWave = 1 if Day_in_2015 <= 53
replace PublishedBeforeWave = 2 if Day_in_2015 >= 54 & Day_in_2015 <= 109
replace PublishedBeforeWave = 3 if Day_in_2015 >= 110 & Day_in_2015 <= 165
replace PublishedBeforeWave = 4 if Day_in_2015 >= 166
tab PublishedBeforeWave

*** Step 2 – Generating a variable indicating the week (recency moderator):
*** Weeks before the survey wave, reversed (more recent is higher weight)
gen WeeksBeforeWave = .
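* A hedged sketch, not part of the original appendix: the week codes assigned in
* this step can also be derived arithmetically, which is handy as a cross-check.
* LastDayBeforeWave and WeeksBeforeWave_check are hypothetical helper names; the
* sketch assumes Day_in_2015 holds whole days and is never below 1. Both helpers
* disappear at the collapse in Step 6, so later steps are unaffected.
gen LastDayBeforeWave = .
replace LastDayBeforeWave = 53 if PublishedBeforeWave == 1
replace LastDayBeforeWave = 109 if PublishedBeforeWave == 2
replace LastDayBeforeWave = 165 if PublishedBeforeWave == 3
* 0-6 days before the cut-off give week 8, 7-13 days give week 7, and so on.
gen WeeksBeforeWave_check = 8 - floor((LastDayBeforeWave - Day_in_2015) / 7)
* Once WeeksBeforeWave has been filled in, the two can be compared with:
*   assert WeeksBeforeWave == WeeksBeforeWave_check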
replace WeeksBeforeWave = 8 if Day_in_2015 >= 47 & Day_in_2015 <= 53
replace WeeksBeforeWave = 7 if Day_in_2015 >= 40 & Day_in_2015 <= 46
replace WeeksBeforeWave = 6 if Day_in_2015 >= 33 & Day_in_2015 <= 39
replace WeeksBeforeWave = 5 if Day_in_2015 >= 26 & Day_in_2015 <= 32
replace WeeksBeforeWave = 4 if Day_in_2015 >= 19 & Day_in_2015 <= 25
replace WeeksBeforeWave = 3 if Day_in_2015 >= 12 & Day_in_2015 <= 18
replace WeeksBeforeWave = 2 if Day_in_2015 >= 5 & Day_in_2015 <= 11
replace WeeksBeforeWave = 1 if Day_in_2015 >= 1 & Day_in_2015 <= 4

replace WeeksBeforeWave = 8 if Day_in_2015 >= 103 & Day_in_2015 <= 109
replace WeeksBeforeWave = 7 if Day_in_2015 >= 96 & Day_in_2015 <= 102
replace WeeksBeforeWave = 6 if Day_in_2015 >= 89 & Day_in_2015 <= 95
replace WeeksBeforeWave = 5 if Day_in_2015 >= 82 & Day_in_2015 <= 88
replace WeeksBeforeWave = 4 if Day_in_2015 >= 75 & Day_in_2015 <= 81
replace WeeksBeforeWave = 3 if Day_in_2015 >= 68 & Day_in_2015 <= 74
replace WeeksBeforeWave = 2 if Day_in_2015 >= 61 & Day_in_2015 <= 67
replace WeeksBeforeWave = 1 if Day_in_2015 >= 54 & Day_in_2015 <= 60

replace WeeksBeforeWave = 8 if Day_in_2015 >= 159 & Day_in_2015 <= 165
replace WeeksBeforeWave = 7 if Day_in_2015 >= 152 & Day_in_2015 <= 158
replace WeeksBeforeWave = 6 if Day_in_2015 >= 145 & Day_in_2015 <= 151
replace WeeksBeforeWave = 5 if Day_in_2015 >= 138 & Day_in_2015 <= 144
replace WeeksBeforeWave = 4 if Day_in_2015 >= 131 & Day_in_2015 <= 137
replace WeeksBeforeWave = 3 if Day_in_2015 >= 124 & Day_in_2015 <= 130
replace WeeksBeforeWave = 2 if Day_in_2015 >= 117 & Day_in_2015 <= 123
replace WeeksBeforeWave = 1 if Day_in_2015 >= 110 & Day_in_2015 <= 116

* Divide by 4.5 (i.e., 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 = 36; 36 / 8 = 4.5)
gen recencyfactor = WeeksBeforeWave / 4.5

* Alternative weighting: give double weight to the last 2 weeks preceding the survey
gen Last2Weeks = WeeksBeforeWave
replace Last2Weeks = (2/1.5) if WeeksBeforeWave == 7 | WeeksBeforeWave == 8
replace Last2Weeks = (1/1.5) if WeeksBeforeWave <= 6
tab Last2Weeks

*** Step 3 – Selecting the news items of interest:
* Drop items that are not economic news
keep if Economisch_Ja == 1
* Drop items that carry no evaluation of the Dutch economy:
tab Posit_Nega
tab Posit_Nega, nol
drop if Posit_Nega == 9
drop if Posit_Nega == .
tab Posit_Nega, nol

*** Step 4 – Constructing the independent variables:
gen tone = .
replace tone = -2 if Posit_Nega == 0
replace tone = -1 if Posit_Nega == 1
replace tone = 0 if Posit_Nega == 2
replace tone = 0 if Posit_Nega == 3
replace tone = 1 if Posit_Nega == 4
replace tone = 2 if Posit_Nega == 5
tab tone
regress tone Day_in_2015

gen negative_tone = .
replace negative_tone = tone if tone < 0
replace negative_tone = 0 if tone >= 0
tab negative_tone

gen positive_tone = .
replace positive_tone = tone if tone > 0
replace positive_tone = 0 if tone <= 0
tab positive_tone

*** Step 5 – Constructing the moderating variables:
* with recency
gen tone_x_recency = tone * recencyfactor
gen tone_x_recency2weeks = tone * Last2Weeks
* with prominence (multiply by length and divide by the average length, taken from the sum output)
sum length_automatic
gen length_averaged = length_automatic / 448.2709
gen tone_x_length = (tone * length_automatic) / 448.2709
sum tone tone_x_recency tone_x_recency2weeks tone_x_length
corr tone tone_x_recency tone_x_recency2weeks tone_x_length

*** Step 6 – Aggregating dataset to outlet/wave level:
tab outlet PublishedBeforeWave
tab tone outlet if PublishedBeforeWave == 1, miss
gen visibility = 1
collapse (count) visibility (sum) tone negative_tone positive_tone length_averaged recencyfactor Last2Weeks tone_x_recency tone_x_recency2weeks tone_x_length, by(outlet PublishedBeforeWave)

* save the aggregated results in a separate file:
export excel using "C:\Vidi\Linking\AggregatedContentAnalysis_Data_appendix.xls", sheet("AggregatedData") firstrow(variables) replace

*** Clear the Stata environment, and open survey data
clear
use "C:\Vidi\Linking\Survey_Appendix.dta", clear

*** Step 7 – Creating the media exposure independent variable:
* Generate a media exposure variable, indicating how often respondents are exposed to all the outlets together
* Newspapers are published six days per week, so the original variables have to be recoded first so that the maximum score 8 (7 days per week) and score 7 (6 days per week) are collapsed:
sum V23a_1 V23a_2 V23a_3 V23a_4 V23a_5 V23a_6 V23a_7 V23a_10
recode V23a_1 V23a_2 V23a_3 V23a_4 V23a_5 V23a_6 V23a_7 V23a_10 (8=7)
* Generate a variable for every individual media outlet that runs from 0 (never read this) to 1 (read all issues of this newspaper). After the recode, the scale runs from 1 to 7. Therefore, first subtract 1 (making the scale run from 0 to 6) and then divide by 6.
gen Telegraaf = (V23a_1 - 1) / 6
gen AlgemeenDagblad = (V23a_2 - 1) / 6
gen Volkskrant = (V23a_3 - 1) / 6
gen NRC = (V23a_4 - 1) / 6
gen Trouw = (V23a_6 - 1) / 6
gen FinancieelDagblad = (V23a_7 - 1) / 6
gen Metro = (V23a_10 - 1) / 6
sum Telegraaf AlgemeenDagblad Volkskrant NRC Trouw FinancieelDagblad Metro
gen NewspaperExposure = Telegraaf + AlgemeenDagblad + Volkskrant + NRC + Trouw + FinancieelDagblad + Metro

*** Step 8 – Preparing the dependent variable:
sum W1_economicexpectationNL W2_economicexpectationNL W3_economicexpectationNL
gen EconomicExpectationNL_Wave1 = W1_economicexpectationNL - 1
gen EconomicExpectationNL_Wave2 = W2_economicexpectationNL - 1
gen EconomicExpectationNL_Wave3 = W3_economicexpectationNL - 1
sum EconomicExpectationNL_Wave1 EconomicExpectationNL_Wave2 EconomicExpectationNL_Wave3
corr EconomicExpectationNL_Wave1 EconomicExpectationNL_Wave2 EconomicExpectationNL_Wave3

*** Step 8b: generate change scores:
gen ChangeExpectation_Wave2 = EconomicExpectationNL_Wave2 - EconomicExpectationNL_Wave1
gen ChangeExpectation_Wave3 = EconomicExpectationNL_Wave3 - EconomicExpectationNL_Wave2

*** Step 9 – Creating media content weighted exposure variables
* Template to fill in for every IV (XXX = the outlet's aggregated content score for that wave):
*gen IV_BeforeWave1 = AlgemeenDagblad * XXX + Telegraaf * XXX + Volkskrant * XXX + FinancieelDagblad * XXX + Metro * XXX + NRC * XXX + Trouw * XXX
*gen IV_BeforeWave2 = AlgemeenDagblad * XXX + Telegraaf * XXX + Volkskrant * XXX + FinancieelDagblad * XXX + Metro * XXX + NRC * XXX + Trouw * XXX
*gen IV_BeforeWave3 = AlgemeenDagblad * XXX + Telegraaf * XXX + Volkskrant * XXX + FinancieelDagblad * XXX + Metro * XXX + NRC * XXX + Trouw * XXX

gen Visibility_BeforeWave1 = AlgemeenDagblad * 24 + Telegraaf * 53 + Volkskrant * 18 + FinancieelDagblad * 61 + Metro * 6 + NRC * 27 + Trouw * 20
gen Visibility_BeforeWave2 = AlgemeenDagblad * 57 + Telegraaf * 100 + Volkskrant * 55 + FinancieelDagblad * 96 + Metro * 10 + NRC * 67 + Trouw * 45
gen Visibility_BeforeWave3 = AlgemeenDagblad * 54 + Telegraaf * 108 + Volkskrant * 59 + FinancieelDagblad * 103 + Metro * 2 + NRC * 62 + Trouw * 43

gen Tone_BeforeWave1 = AlgemeenDagblad * -4 + Telegraaf * -4 + Volkskrant * 3 + FinancieelDagblad * -2 + Metro * 0 + NRC * -10 + Trouw * 0
gen Tone_BeforeWave2 = AlgemeenDagblad * 5 + Telegraaf * 19 + Volkskrant * 13 + FinancieelDagblad * 28 + Metro * 7 + NRC * 4 + Trouw * 0
gen Tone_BeforeWave3 = AlgemeenDagblad * 17 + Telegraaf * 10 + Volkskrant * 37 + FinancieelDagblad * 28 + Metro * 0 + NRC * 8 + Trouw * 7

gen Tone_x_Recency_BeforeWave1 = AlgemeenDagblad * -9.777777672 + Telegraaf * -8.222222209 + Volkskrant * 6.444444418 + FinancieelDagblad * -5.555555463 + Metro * -0.222222209 + NRC * -16.00000024 + Trouw * 1.111111045
gen Tone_x_Recency_BeforeWave2 = AlgemeenDagblad * 6.888888791 + Telegraaf * 16.22222219 + Volkskrant * 5.111110911 + FinancieelDagblad * 30.66666743 + Metro * 4.000000089 + NRC * 2.444444567 + Trouw * -3.111111119
gen Tone_x_Recency_BeforeWave3 = AlgemeenDagblad * 24.22222275 + Telegraaf * 14.22222239 + Volkskrant * 39.11111175 + FinancieelDagblad * 35.11111186 + Metro * 1.333333358 + NRC * 16.2222223 + Trouw * 12.6666669

gen Tone_x_Recency2weeks_BeforeWave1 = AlgemeenDagblad * -15 + Telegraaf * -5.25 + Volkskrant * 7.5 + FinancieelDagblad * -4.5 + Metro * -0.75 + NRC * -11.25 + Trouw * 1.5
gen Tone_x_Recency2weeks_BeforeWave2 = AlgemeenDagblad * 3.75 + Telegraaf * 15.75 + Volkskrant * 11.25 + FinancieelDagblad * 27.75 + Metro * 4.5 + NRC * -3 + Trouw * -5.25
gen Tone_x_Recency2weeks_BeforeWave3 = AlgemeenDagblad * 18 + Telegraaf * 10.5 + Volkskrant * 36 + FinancieelDagblad * 34.5 + Metro * 0.75 + NRC * 12.75 + Trouw * 9.75

gen Tone_x_Length_BeforeWave1 = AlgemeenDagblad * -5.472137377 + Telegraaf * -7.943856932 + Volkskrant * -2.195101947 + FinancieelDagblad * -11.65366736 + Metro * 1.820327848 + NRC * -20.28014734 + Trouw * -6.955615908
gen Tone_x_Length_BeforeWave2 = AlgemeenDagblad * 2.60110566 + Telegraaf * 16.84249357 + Volkskrant * 3.388576105 + FinancieelDagblad * 29.94617756 + Metro * 4.202815644 + NRC * -29.54909584 + Trouw * -1.516940117
gen Tone_x_Length_BeforeWave3 = AlgemeenDagblad * 11.19189306 + Telegraaf * -14.35069688 + Volkskrant * 38.92958499 + FinancieelDagblad * 2.043407455 + Metro * -0.506390214 + NRC * -19.31867474 + Trouw * 0.479620695

*** Step 10 – Reshape the dataset into a long format (units of analysis become respondent / wave combinations)
* Reshape dataset into long format
reshape long EconomicExpectationNL_Wave ChangeExpectation_Wave Visibility_BeforeWave Tone_BeforeWave Tone_x_Recency_BeforeWave Tone_x_Recency2weeks_BeforeWave Tone_x_Length_BeforeWave , i(volgnr) j(Wave)
xtset volgnr Wave
tab Wave
xtdescribe
egen z_Wave = std(Wave)

*** Step 10b: generate z-standardized independent variables to later assess standardized effects not calculated automatically in a fixed effect model
egen z_EconomicExpectationNL_Wave = std(EconomicExpectationNL_Wave)
egen z_ChangeExpectation_Wave = std(ChangeExpectation_Wave)
egen z_NewspaperExposure = std(NewspaperExposure)
egen z_Visibility_BeforeWave = std(Visibility_BeforeWave)
egen z_Tone_BeforeWave = std(Tone_BeforeWave)
egen z_Tone_x_Recency_BeforeWave = std(Tone_x_Recency_BeforeWave)
egen z_Tone_x_Recency2W_BeforeWave = std(Tone_x_Recency2weeks_BeforeWave)
egen z_Tone_x_Length_BeforeWave = std(Tone_x_Length_BeforeWave)
sum z_NewspaperExposure z_Visibility_BeforeWave z_Tone_BeforeWave z_EconomicExpectationNL_Wave z_ChangeExpectation_Wave z_Tone_x_Recency_BeforeWave z_Tone_x_Recency2W_BeforeWave z_Tone_x_Length_BeforeWave

*** Step 11 – Analyze correlation between dependent variable in different waves:
corr EconomicExpectationNL_Wave l.EconomicExpectationNL_Wave l2.EconomicExpectationNL_Wave
* Correlations between .6 and .7; so, lagged DV models are most appropriate

*** Step 12 – Check for issues of multicollinearity:
* creating total scores
* assess the correlations:
corr NewspaperExposure Visibility_BeforeWave Tone_BeforeWave Tone_x_Recency_BeforeWave Tone_x_Recency2weeks_BeforeWave Tone_x_Length_BeforeWave

*** Step 13 – Step-by-step analysis plan
** Model 1: Lagged dependent variable only
regress EconomicExpectationNL_Wave Wave l.EconomicExpectationNL_Wave, cluster(volgnr)
* standardized coefficients:
regress z_EconomicExpectationNL_Wave z_Wave l.z_EconomicExpectationNL_Wave, cluster(volgnr)

** Model 2: Lagged dependent variable + self-reported exposure
regress EconomicExpectationNL_Wave Wave l.EconomicExpectationNL_Wave NewspaperExposure, cluster(volgnr)
* standardized coefficients:
regress z_EconomicExpectationNL_Wave z_Wave l.z_EconomicExpectationNL_Wave z_NewspaperExposure, cluster(volgnr)

** Model 3a: self-reported exposure + visibility
regress EconomicExpectationNL_Wave Wave l.EconomicExpectationNL_Wave NewspaperExposure Visibility_BeforeWave, cluster(volgnr)
vif
* standardized coefficients:
regress z_EconomicExpectationNL_Wave z_Wave l.z_EconomicExpectationNL_Wave z_NewspaperExposure z_Visibility_BeforeWave, cluster(volgnr)

** Model 3b: without self-reported exposure
regress EconomicExpectationNL_Wave Wave l.EconomicExpectationNL_Wave Visibility_BeforeWave, cluster(volgnr)
vif
* standardized coefficients:
regress z_EconomicExpectationNL_Wave z_Wave l.z_EconomicExpectationNL_Wave z_Visibility_BeforeWave, cluster(volgnr)

* Model 4a: including tone (and visibility):
corr Visibility_BeforeWave Tone_BeforeWave
regress EconomicExpectationNL_Wave Wave l.EconomicExpectationNL_Wave Visibility_BeforeWave Tone_BeforeWave, cluster(volgnr)
vif
* standardized coefficients:
regress z_EconomicExpectationNL_Wave z_Wave l.z_EconomicExpectationNL_Wave z_Visibility_BeforeWave z_Tone_BeforeWave, cluster(volgnr)

* Model 4b: including tone (no visibility):
regress EconomicExpectationNL_Wave Wave l.EconomicExpectationNL_Wave Tone_BeforeWave, cluster(volgnr)
* standardized coefficients:
regress z_EconomicExpectationNL_Wave z_Wave l.z_EconomicExpectationNL_Wave z_Tone_BeforeWave, cluster(volgnr)

* Model 4c: including tone, visibility, and self-reported exposure:
regress EconomicExpectationNL_Wave Wave l.EconomicExpectationNL_Wave NewspaperExposure Visibility_BeforeWave Tone_BeforeWave, cluster(volgnr)
vif
* standardized coefficients:
regress z_EconomicExpectationNL_Wave z_Wave l.z_EconomicExpectationNL_Wave z_NewspaperExposure z_Visibility_BeforeWave z_Tone_BeforeWave, cluster(volgnr)

* Model 4d: Change model:
regress ChangeExpectation_Wave Wave Visibility_BeforeWave Tone_BeforeWave, cluster(volgnr)
* standardized coefficients:
regress z_ChangeExpectation_Wave z_Wave z_Visibility_BeforeWave z_Tone_BeforeWave, cluster(volgnr)

*** Model 5: variations of tone
* Model 5a: basic models without weights
regress EconomicExpectationNL_Wave Wave l.EconomicExpectationNL_Wave Visibility_BeforeWave Tone_BeforeWave, cluster(volgnr)
vif
* standardized coefficients:
regress z_EconomicExpectationNL_Wave z_Wave l.z_EconomicExpectationNL_Wave z_Visibility_BeforeWave z_Tone_BeforeWave, cluster(volgnr)

* Model 5b: effect of tone weighted by recency
regress EconomicExpectationNL_Wave Wave l.EconomicExpectationNL_Wave Visibility_BeforeWave Tone_x_Recency_BeforeWave, cluster(volgnr)
vif
* standardized coefficients:
regress z_EconomicExpectationNL_Wave z_Wave l.z_EconomicExpectationNL_Wave z_Visibility_BeforeWave z_Tone_x_Recency_BeforeWave, cluster(volgnr)

* Model 5c: effect of tone weighted by recency of 2 weeks
regress EconomicExpectationNL_Wave Wave l.EconomicExpectationNL_Wave Visibility_BeforeWave Tone_x_Recency2weeks_BeforeWave, cluster(volgnr)
vif
* standardized coefficients:
regress z_EconomicExpectationNL_Wave z_Wave l.z_EconomicExpectationNL_Wave z_Visibility_BeforeWave z_Tone_x_Recency2W_BeforeWave, cluster(volgnr)

* Model 5d: effect of tone weighted by prominence
regress EconomicExpectationNL_Wave Wave l.EconomicExpectationNL_Wave Visibility_BeforeWave Tone_x_Length_BeforeWave, cluster(volgnr)
vif
* standardized coefficients:
regress z_EconomicExpectationNL_Wave z_Wave l.z_EconomicExpectationNL_Wave z_Visibility_BeforeWave z_Tone_x_Length_BeforeWave, cluster(volgnr)

*** Step 13a – Fixed effect models:
* Model 6a: Effects of visibility only
xtreg EconomicExpectationNL_Wave Visibility_BeforeWave, fe
* standardized effects:
xtreg z_EconomicExpectationNL_Wave z_Visibility_BeforeWave, fe

* Model 6b: Effects of visibility and tone
xtreg EconomicExpectationNL_Wave Visibility_BeforeWave Tone_BeforeWave, fe
* standardized effects:
xtreg z_EconomicExpectationNL_Wave z_Visibility_BeforeWave z_Tone_BeforeWave, fe

*** Step 14 – Random effect models (when possible):
* Check whether random effects models are appropriate:
xtreg EconomicExpectationNL_Wave Visibility_BeforeWave Tone_BeforeWave, fe
estimate store fixed1
xtreg EconomicExpectationNL_Wave Visibility_BeforeWave Tone_BeforeWave, re
estimate store random1
hausman fixed1 random1
* Hausman test is significant, so random effects models should not be used.
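* A minimal sketch, not part of the original appendix: the Hausman verdict can also
* be read from the results that hausman leaves in r(), assuming no other r-class
* command has run in between. display does not touch the stored estimates.
display "Hausman chi2(" r(df) ") = " %6.2f r(chi2) ", p = " %5.4f r(p)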
xttest0
* Breusch and Pagan Lagrangian multiplier test for random effects: Significant, so random effects exist

*** End:
* Bring data back to wide format
drop z_Wave
reshape wide EconomicExpectationNL_Wave z_EconomicExpectationNL_Wave ChangeExpectation_Wave z_ChangeExpectation_Wave Visibility_BeforeWave Tone_BeforeWave z_Visibility_BeforeWave z_Tone_BeforeWave Tone_x_Recency_BeforeWave z_Tone_x_Recency_BeforeWave Tone_x_Recency2weeks_BeforeWave z_Tone_x_Recency2W_BeforeWave Tone_x_Length_BeforeWave z_Tone_x_Length_BeforeWave , i(volgnr) j(Wave)
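
* A toy illustration, not part of the original appendix, of how the weighted
* exposure measures in Step 9 are built: each outlet's exposure score (0 to 1) is
* multiplied by that outlet's aggregated content score and the products are summed.
* The two respondents and the variable Visibility_toy are made up; the weights
* 53 and 27 are the wave-1 visibility counts for Telegraaf and NRC used in Step 9.
* preserve/restore keeps the survey data in memory untouched.
preserve
clear
input id Telegraaf NRC
1 1 0
2 .5 .5
end
gen Visibility_toy = Telegraaf * 53 + NRC * 27
* Reading every Telegraaf issue gives 53; reading half of each paper's issues gives 26.5 + 13.5 = 40.
assert Visibility_toy == 53 in 1
assert Visibility_toy == 40 in 2
list
restore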