# Tutorial 3: Creating a radar plot for custom word categories in a corpus of Dutch fairytales

![StoryNavigator Logo](../../doc/widgets/images/storynavigator_logo_small.png)

---

This tutorial is part of a series demonstrating the use of StoryNavigator widgets. These tutorials show how to use StoryNavigator widgets with other pre-existing widgets available within the Orange platform, and how to generate output via tables or figures. Each tutorial addresses a research question related to the narrative structure and contents of the corpus of stories.

---

### Step 0: Research question
In this tutorial, you will learn how to create a radar plot based on custom word categories in a corpus of Dutch fairytales. We will use a predefined Orange workflow to answer the following research question derived from Andrade & Andersen (2020):

- How do different categories of custom words distribute across a selected story?

We use the following workflow:

![Workflow](../../doc/widgets/images/radarplot_based_on_custom_work_list.jpg)

This workflow can be downloaded [here](https://github.com/navigating-stories/orange-story-navigator/tree/master/doc/widgets/workflows), and it uses a dataset of Dutch fairytales which can be found [here](https://github.com/navigating-stories/orange-story-navigator/tree/master/doc/widgets/fairytales/).

### Step 1: Load the corpus
To begin, load the corpus of Dutch fairytales in tab format using the **Corpus** widget. This widget allows you to import your dataset. Use the **Corpus Viewer** to inspect the dataset.

- **Task**: Load your dataset (Dutch fairytales) for visual inspection.
- **Outcome**: You will be able to visually inspect the text and ensure that your dataset is loaded correctly with the **Corpus Viewer**.

### Step 2: Load the custom word list
Next, load the custom word list of Dutch verbs, inspired by Halliday (2004), using the **File** widget, which connects to the **Data Table** widget. The file `dutch_halliday_action_list.csv` can be found [here](../../orangecontrib/storynavigation/resources). The table contains verbs and their predefined categories.

![Halliday word list table](../../doc/widgets/images/halliday_table.png)

- **Task**: Import the custom word list inspired by Halliday (2004).
- **Outcome**: The custom word list is loaded and ready for categorization.

### Step 3: Extract story elements
Connect the **Corpus** and **File** widgets to the **Elements** widget to extract elements from the text. In the **Elements** widget's menu, select the column that contains the words so that the custom words are extracted.

- **Task**: Extract story elements and their categories for further analysis.
- **Outcome**: Access to categorized story elements.
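
Conceptually, this step looks up each word of a story in the custom word list and attaches the matching category. The sketch below illustrates that idea in plain Python. It is not the **Elements** widget's implementation: the `WORD_LIST` entries, the category labels, and the `tag_tokens` helper are made-up examples rather than the contents of `dutch_halliday_action_list.csv`, and the sketch assumes that only exact lowercase word forms are matched.

```python
# Hypothetical mini word list: verb -> category (illustrative values only,
# not copied from dutch_halliday_action_list.csv)
WORD_LIST = {
    "lopen": "material",
    "denken": "mental",
    "zeggen": "verbal",
}


def tag_tokens(tokens, word_list):
    """Return (token, category) pairs for tokens found in the word list."""
    tagged = []
    for token in tokens:
        # Assumption: matching is done on the exact lowercase word form
        category = word_list.get(token.lower())
        if category is not None:
            tagged.append((token, category))
    return tagged


tokens = ["De", "prinses", "wil", "lopen", "en", "denken"]
tagged = tag_tokens(tokens, WORD_LIST)
print(tagged)
```

Words that do not occur in the word list are simply skipped, so only the categorized words are passed on for counting.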

### Step 4: Analyze actors
Use the **Actors** widget to directly observe custom tokens (e.g., verbs) highlighted in the texts.

- **Task**: Identify and observe custom tokens in the text.
- **Outcome**: Insight into categorized words in the text, focusing on verbs.

### Step 5: Group data
Use the **Group By** widget to group the data by story ID and Halliday category, then aggregate the `freq` variable with *Sum* to count the occurrences in each group.

- **Task**: Group data by story ID and category.
- **Outcome**: Frequency counts of categories for each story or segment.
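
The aggregation performed by **Group By** can be sketched outside Orange as follows. This is an illustration under assumptions: the column names `storyid`, `category`, and `freq` follow the names used by the plotting script in this tutorial, while the rows and the `sum_freq_by_group` helper are invented for the example.

```python
from collections import defaultdict

# Invented example rows: one row per matched word (values are illustrative)
rows = [
    {"storyid": "ST1", "category": "material", "freq": 2},
    {"storyid": "ST1", "category": "mental", "freq": 1},
    {"storyid": "ST1", "category": "material", "freq": 3},
    {"storyid": "ST2", "category": "verbal", "freq": 4},
]


def sum_freq_by_group(rows):
    """Sum 'freq' for every (storyid, category) pair."""
    totals = defaultdict(int)
    for row in rows:
        totals[(row["storyid"], row["category"])] += row["freq"]
    return dict(totals)


freq_sum = sum_freq_by_group(rows)
for (storyid, category), total in sorted(freq_sum.items()):
    print(storyid, category, total)
```

Each `(storyid, category)` pair ends up with a single total, which corresponds to the `freq - Sum` column that the plotting script reads later.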

### Step 6: Select rows for plotting
Select the specific categories (dimensions) to plot using the **Select Rows** widget.

- **Task**: Choose categories to focus on for the radar plot.
- **Outcome**: Refined data selection for plotting purposes.
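
As a rough analogue of what **Select Rows** does here, the sketch below keeps only the grouped rows that belong to one story and to a chosen set of categories. It assumes the selection is made on story ID and category. The `grouped` rows and the `select_rows` helper are hypothetical, and the column names mirror those used by the plotting script in this tutorial.

```python
# Invented grouped rows, shaped like the output of the Group By widget
grouped = [
    {"storyid": "ST1", "category": "material", "freq - Sum": 5},
    {"storyid": "ST1", "category": "mental", "freq - Sum": 1},
    {"storyid": "ST1", "category": "verbal", "freq - Sum": 2},
    {"storyid": "ST2", "category": "material", "freq - Sum": 4},
]


def select_rows(rows, storyid, categories):
    """Keep rows of one story whose category is in the chosen set."""
    wanted = set(categories)
    return [
        row
        for row in rows
        if row["storyid"] == storyid and row["category"] in wanted
    ]


selected = select_rows(grouped, "ST1", ["material", "mental"])
for row in selected:
    print(row["category"], row["freq - Sum"])
```

Restricting the data to a single story matters because the plotting script takes the story ID from the first row and assumes it is the same for all rows.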

### Step 7: Edit Domain
Change the variable type of the selected category for better plotting outcomes using the **Edit Domain** widget.

- **Task**: Adjust variable types for plotting.
- **Outcome**: Optimized variable types for radar plot creation.

### Step 8: Create a radar plot
Connect the data to a **Python Script** widget to generate a radar plot comparing the distribution of categories from the custom word list for a selected story. Note that this radar plot is not available as a standard Orange widget. However, the **Python Script** widget lets you add custom figures or analyses to your workflow. Copy and paste the following radar plot script into the editor of the **Python Script** widget:

```python
import matplotlib.pyplot as plt
import numpy as np

# Initialize the outputs
out_data = in_data
out_learner = None
out_classifier = None
out_object = None

if in_data is not None and len(in_data) > 0:  # Ensure there is data to plot
    try:
        # Extract the 'storyid', 'category', and 'freq - Sum' columns from the input data
        storyid = in_data.get_column(in_data.domain["storyid"])[0]  # Assuming storyid is the same for all rows
        categories = in_data.get_column(in_data.domain["category"])
        freq_sum = in_data.get_column(in_data.domain["freq - Sum"])
        
        # Convert categories to string
        categories = [str(cat) for cat in categories]
        
        # Number of categories
        N = len(categories)
        
        # Compute one angle per category, evenly spaced around the circle
        angles = np.linspace(0, 2 * np.pi, N, endpoint=False)

        # Initialize polar bar plot
        fig = plt.figure(figsize=(8, 8))
        ax = fig.add_subplot(111, polar=True)

        # Assign different colors for each bar using a colormap
        colors = plt.cm.viridis(np.linspace(0, 1, N))

        # Plot one bar per category. Bars need no repeated first point:
        # closing the loop is only required for line-based radar plots,
        # and repeating it here would draw a second bar over the first one.
        bars = ax.bar(angles, freq_sum, width=0.4, color=colors, alpha=0.7)

        # Set the category names as the x-ticks
        ax.set_xticks(angles)
        ax.set_xticklabels(categories)

        # Optional: Customize y-axis labels or grid
        ax.yaxis.grid(True)
        
        # Display the storyID below the plot
        plt.figtext(0.5, 0.01, f'StoryID: {storyid}', ha='center', va='center', fontsize=12, color='black')

        # Display the polar bar plot
        plt.show()

    except KeyError as e:
        print(f"Error: Column {e} not found in the data. Check the column names.")
```

- **Task**: Generate a radar plot.
- **Outcome**: Visual representation of the distribution of categories.

Similar to the plot in Andrade & Andersen (2020), this generates the following radar plot for the selected story:

![Radar plot](../../doc/widgets/images/story_radarplot.png)
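
To experiment with the figure outside Orange, a simplified standalone variant of the plotting logic might look like the sketch below. It draws one polar bar per category. The `polar_bar_plot` function name and the example values are invented for illustration and do not come from the fairytale corpus. The non-interactive `Agg` backend is an assumption made so that the code also runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np


def polar_bar_plot(categories, values, title=""):
    """Draw one polar bar per category and return the figure and axes."""
    n = len(categories)
    # One angle per category, evenly spaced around the circle
    angles = np.linspace(0, 2 * np.pi, n, endpoint=False)
    colors = plt.cm.viridis(np.linspace(0, 1, n))

    fig = plt.figure(figsize=(8, 8))
    ax = fig.add_subplot(111, polar=True)
    ax.bar(angles, values, width=0.4, color=colors, alpha=0.7)
    ax.set_xticks(angles)
    ax.set_xticklabels(categories)
    ax.yaxis.grid(True)
    # Caption below the plot, e.g. the story ID
    fig.text(0.5, 0.01, title, ha="center", va="center", fontsize=12)
    return fig, ax


# Invented example values (not taken from the fairytale corpus)
categories = ["material", "mental", "verbal", "relational"]
values = [5, 1, 4, 2]
fig, ax = polar_bar_plot(categories, values, title="StoryID: ST1")
```

Calling `fig.savefig("radarplot_example.png")` afterwards writes the figure to a file, which is convenient when no plot window is available.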

### Conclusion
By following these steps, you'll be able to create a radar plot for custom word categories in the corpus of Dutch fairytales, helping you understand how specific categories of words distribute within and across selected stories. This allows you to compare the narrative structure and contents between stories.

### References

- Andrade, S. B., & Andersen, D. (2020). Digital story grammar: A quantitative methodology for narrative analysis. *International Journal of Social Research Methodology, 23*(4), 405-421. https://doi.org/10.1080/13645579.2020.1723205

- Halliday, M. A. K. (2004). *An introduction to functional grammar*. London, UK: Routledge.