sumy summarization
To summarize RTSK Worknote data for each unique RTSK Number, you can group the rows by RTSK Number and run the Sumy library on each group. In this example, we concatenate the worknotes for each RTSK Number into a single text block, apply summarization, and store the result in a summary column.
Step-by-Step Code
1. Install Sumy
First, install sumy if it's not already installed, for example with pip install sumy.
2. Import Libraries
3. Define the Summarization Function
This function uses Sumy’s LSA Summarizer (you can also choose other algorithms from Sumy) to summarize the concatenated worknotes for each RTSK Number.
4. Group by RTSK Number, Concatenate Worknotes, and Apply Summarization
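The grouping step can be sketched as follows. The column names RTSK Number, RTSK Worknote, and RTSK Worknote Summary come from this post; the helper summarize_groups, the sample rows, and the stand-in first_sentence summarizer are hypothetical and exist only so the sketch runs without Sumy installed. In practice you would pass summarize_text instead of first_sentence.

```python
import pandas as pd


def summarize_groups(df, summarize, key="RTSK Number", text_col="RTSK Worknote"):
    """Concatenate worknotes per key, then summarize each text block."""
    combined = (
        df.dropna(subset=[text_col])      # skip rows with no worknote
        .groupby(key)[text_col]
        .agg(lambda notes: " ".join(notes.astype(str)))
    )
    summaries = combined.apply(summarize)
    # One row per key, with the summary in its own column.
    return summaries.rename("RTSK Worknote Summary").reset_index()


# Hypothetical sample data, for illustration only.
df = pd.DataFrame(
    {
        "RTSK Number": ["RTSK001", "RTSK001", "RTSK002"],
        "RTSK Worknote": [
            "Restarted the service.",
            "Issue resolved.",
            "Awaiting vendor reply.",
        ],
    }
)


def first_sentence(text):
    # Stand-in summarizer: keeps only the first sentence of the block.
    return text.split(". ")[0].rstrip(".") + "."


summary_df = summarize_groups(df, first_sentence)
print(summary_df)
```

Passing the summarizer in as an argument keeps the grouping logic independent of any particular summarization algorithm.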
Explanation of the Code
- Summarize Each Group: The code groups the data by RTSK Number, concatenates all RTSK Worknote values in each group, and applies summarize_text to generate a summary.
- Merge Summary with Original Data: If you want each row to have its RTSK Worknote Summary in the original dataset, merge it back using the RTSK Number.
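The merge step could look like this. It is a sketch under the assumption that the per-group summaries sit in a separate DataFrame with one row per RTSK Number; both frames below contain hypothetical sample data.

```python
import pandas as pd

# Hypothetical original data: one row per worknote.
df = pd.DataFrame(
    {
        "RTSK Number": ["RTSK001", "RTSK001", "RTSK002"],
        "RTSK Worknote": [
            "Restarted the service.",
            "Issue resolved.",
            "Awaiting vendor reply.",
        ],
    }
)

# Hypothetical summaries: one row per RTSK Number.
summary_df = pd.DataFrame(
    {
        "RTSK Number": ["RTSK001", "RTSK002"],
        "RTSK Worknote Summary": ["Service restarted.", "Waiting on vendor."],
    }
)

# A left merge keeps every original row and repeats the group's summary
# on each row that shares the same RTSK Number.
merged = df.merge(summary_df, on="RTSK Number", how="left")
print(merged)
```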
This will add a new column, RTSK Worknote Summary, with summarized comments for each RTSK Number.