Language as Data 1
Online Lecture Notes:
-
Overview of “Data”
- Etymology and Definition
- Derived from the Latin word datum (something given).
- Defined by the Oxford English Dictionary as collective information, often numerical, used for analysis, reference, or computation.
- In computing: quantities, characters, or symbols processed collectively.
- Public Perception of Data
- Often linked to concepts like big data, AI, and digitization.
- Popular images depict data as abstract, impersonal, and intimidating.
- However, these representations often exaggerate the power of data systems.
-
Algorithms and Their Relationship to Data
- Definition:
- A set of rules or procedures for calculation or problem-solving.
- Algorithms are implemented in programming languages (e.g., Python, Java) and executed as software components.
- Interaction with Data:
- Algorithms process input data to generate output, forming the backbone of software systems.
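To make the definition above concrete, here is a minimal sketch of an algorithm in Python: a fixed set of rules that turns input data (a text) into output data (word counts). The function name `word_frequencies`, the tokenization rule, and the sample sentence are illustrative assumptions, not part of the lecture.

```python
import re
from collections import Counter


def word_frequencies(text: str) -> Counter:
    """Count how often each word occurs in a text (hypothetical helper)."""
    # Rule 1: lowercase the input so "Data" and "data" count as one word.
    # Rule 2: treat runs of letters and apostrophes as words (a simplification).
    tokens = re.findall(r"[a-z']+", text.lower())
    # Rule 3: tally the tokens.
    return Counter(tokens)


sample = "Data are never raw. Data are always cooked."
counts = word_frequencies(sample)
print(counts["data"], counts["raw"])
```

Even this small example involves choices (what counts as a word, whether case matters), so the output depends on decisions made when writing the rules.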
-
Contextualizing Data in Social and Technical Discourses
- Two Approaches to Conceptualizing Data:
- Traditional View:
- Data as “given” and objective, forming the basis for knowledge and decision-making.
- Criticism: Data collection always involves selection and contextualization, making it a subjective process.
- Critical Perspective:
- Data as a “reinterpretation” or formalization of complex phenomena.
- ISO/IEC defines data as a formalized representation suitable for communication and processing.
- Implications:
- Data is not neutral but shaped by social, cultural, and institutional contexts.
- Collecting and interpreting data is a constructive process, influenced by perspectives and interests.
-
Critical Data Studies
- Key Points:
- Data reflects and reinforces power structures (e.g., corporate control, algorithmic biases).
- Researchers emphasize that data is “cooked,” not “raw,” shaped by selection and recontextualization.
- Examples of biased data systems:
- Gender and racial biases in facial recognition.
- Misguided word associations in language models (e.g., “Man is to Computer Programmer as Woman is to Homemaker”).
- Algorithms optimized for efficiency, often ignoring individual and societal impacts.
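The word-association example above comes from analogy tests on word embeddings, where "a is to b as c is to ?" is answered by vector arithmetic. The sketch below shows only the mechanism. The two-dimensional vectors are hand-picked toy values, not taken from any real model, and the names `toy_vectors`, `cosine`, and `analogy` are hypothetical.

```python
import math

# Hypothetical 2-D "embeddings", invented for illustration only.
# Axis 0 stands in for a gender direction, axis 1 for "is an occupation".
# The bias is built in by hand: "programmer" leans male, "homemaker" female.
toy_vectors = {
    "man": (1.0, 0.0),
    "woman": (-1.0, 0.0),
    "programmer": (0.9, 1.0),
    "homemaker": (-0.9, 1.0),
    "teacher": (0.0, 1.0),
}


def cosine(u, v):
    """Cosine similarity between two 2-D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def analogy(a, b, c, vectors):
    """Answer 'a is to b as c is to ?' using the offset b - a + c."""
    target = tuple(
        vb - va + vc for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])
    )
    # The three query words are excluded from the candidate answers.
    candidates = [w for w in vectors if w not in {a, b, c}]
    return max(candidates, key=lambda w: cosine(vectors[w], target))


print(analogy("man", "programmer", "woman", toy_vectors))  # prints "homemaker"
```

The arithmetic itself is neutral. The stereotyped answer appears because the association is already encoded in the vectors, which in a real model are learned from the texts it was trained on.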
-
Applications and Challenges
- Data in Journalism and Politics
- Data journalism analyzes structured information to uncover newsworthy stories (e.g., emergency services coverage, election analytics).
- Data-driven campaigns, such as Barack Obama’s election strategy, illustrate the growing influence of big data in politics.
- The misuse of data (e.g., disinformation) highlights its role as a form of power.
- Ethical Concerns:
- The need for iterative improvements and debugging of data systems.
- Questions about fairness, transparency, and accountability in algorithmic decision-making.
- Challenges in identifying and mitigating unintended harms caused by data systems.
-
Key Quotes and Insights
- “Data is always reinterpreted information”
- Algorithms often prioritize efficiency and profitability over individual well-being and societal health.
- “Data are never raw but always cooked” — emphasizes the constructed nature of data.
-
Closing Thoughts
Lecture Notes
-
Two paths of conceptualisation of “data” as a term:
- Data is something given that can be captured and further processed. Data does not have a meaning by itself.
- Critique: Data is always selected, given a perspective, and contextualised when it is collected.
- Data is something that can be obtained by reduction, selection or formalization from units with a higher degree of complexity. Data is always cooked, and is never raw.
- IT perspective:
- A reinterpretable representation of information in a formalized manner, suitable for communication, interpretation, or processing. Examples of data include a sequence of bits, a table of numbers, the characters on a page, the recording of sounds made by a person speaking, or a moon rock specimen.
- In Digital Discourse Analysis:
- A datum is a phenomenon interpreted as a sign, which in the course of a research process is extracted from a given complexion and recontextualized.
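The IT definition above calls data a "reinterpretable representation" and lists a sequence of bits and the characters on a page as examples. As a minimal sketch, the same four-letter word can be read as characters, as numbers, as bytes, or as bits. The variable names are illustrative assumptions, and UTF-8 is chosen here as one common encoding among several.

```python
text = "data"

# Reading 1: a list of Unicode code points (numbers).
code_points = [ord(ch) for ch in text]

# Reading 2: a sequence of bytes under the UTF-8 encoding.
raw_bytes = text.encode("utf-8")

# Reading 3: a sequence of bits, eight per byte.
bits = " ".join(f"{byte:08b}" for byte in raw_bytes)

# Reading 4: one large integer built from the same bytes.
as_integer = int.from_bytes(raw_bytes, "big")

print(code_points)  # [100, 97, 116, 97]
print(bits)         # 01100100 01100001 01110100 01100001
print(as_integer)   # 1684108385

# Reinterpreting the integer as text recovers the original word.
print(as_integer.to_bytes(4, "big").decode("utf-8"))  # data
```

None of these readings is the "real" one. Which interpretation applies depends on the convention agreed between whoever produces the representation and whoever reads it.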
-
Critical Data Studies
- “Data as a form of power”
- Controlling data
- Manipulating data
- Gathering Data