Workshop #2

Survey Analytics from Questionnaires and Textual Social Media Analytics. With Accompanying Practical Sessions, Examples and Case Studies in English and Other Languages.

  • Names of instructors:
    • Prof. Fionn Murtagh, Big Data Lab, University of Derby; and Goldsmiths, University of London.
    • Associate Professor Mohsen Farid, Big Data Lab, University of Derby.
  • Short description:
    The work of the celebrated social scientist Pierre Bourdieu (1930-2002) includes the thoughtful and creative use of Correspondence Analysis, notably in his book Distinction, published in English in 1984. This course is based on such a geometric data analysis approach.
    The focus is: (1) interpretation of results, graphical displays and other outputs, (2) practical implementation using the R statistical and visualization environment, and (3) providing intuition for, and a full understanding of, the geometry and statistical processing. We use data collected in various questionnaires, starting from work by Bourdieu on cultural taste. Other questionnaire analysis case studies relate to transport, cooking and lifestyle, student experience, consumer behavior, and music appreciation.
    Next, we consider questionnaires that combine closed, fixed-format questions with free-text responses, and analyze the two conjointly.
    Finally, we study data sourced from social media micro-blogging, i.e. Twitter.
  • Syllabus
    The course uses the R programming and visualization language.
    In accompanying online course materials, there will be a practical introduction to the R language and environment.  This is for participants who have not used R before.
    Part 1: Questionnaire analysis case study: the Bourdieu taste data, with detailed discussion of the output and of the R code used.
    Part 2: Geometric intuition: the methodology used for graphical display, hierarchical clustering, and putting it all together.
    Part 3: Carrying out geometric data analysis, including clustering, using R. This includes producing publication and presentation outputs, storing data for later work, and maintaining the R scripts that are used.
    Part 4: Further case studies of questionnaire analysis.
    Part 5: Questionnaire analysis using conjoint, or integrally related, analysis of closed questions and open (free-text) questions.
    Part 6: Coverage of social media data sources, centered especially on Twitter.
    Final Part: Short concluding debate and discussion on the potential and scope of analytics, statistical treatment of data, and text mining.
    All sessions will be associated with practical exercises, using case studies.
  • Target Audience: Practitioners and researchers in any of the domains encompassed by the case studies and practical exercises, and students who are undertaking, or planning to undertake, such work. Domains of general relevance include:
    Health and medical surveys, marketing, security and forensics, information and data sourcing through web-based questionnaires, lifestyle and well-being analytics, legal studies, political studies, language and literature, and digital humanities.
    The presentation language of the short course is English. Case studies will also be in English; however, issues related to other languages, such as Arabic, may be addressed.
  • Facilities Required

    • Computers for participants. Participants may also use their own laptops, provided the complete software environment is installed.
    • Software: R, which is open source and freely available for all computer platforms, with pertinent packages as required.
    • Course Material. All course materials, including the data and examples of software use for the case studies, will be made available to course participants on a password-protected website.