Pysummarization

Got it! Here’s how you can group your dataframe df by RTSK Number and create one summary per RTSK Number using the pysummarization library.

First, make sure you’ve got pandas and pysummarization installed:

sh
pip install pandas pysummarization

Then, here’s a script to get you started:

python
import pandas as pd
from pysummarization.nlpbase.auto_abstractor import AutoAbstractor
from pysummarization.tokenizabledoc.simple_tokenizer import SimpleTokenizer
from pysummarization.abstractabledoc.top_n_rank_abstractor import TopNRankAbstractor

# Your sample dataframe
data = {
    'RTSK Number': ['001', '001', '002', '002'],
    'RTSK Worknote': [
        'Worknote 1 for RTSK 001.',
        'Worknote 2 for RTSK 001.',
        'Worknote 1 for RTSK 002.',
        'Worknote 2 for RTSK 002.'
    ]
}
df = pd.DataFrame(data)

# Initialize summarization components
auto_abstractor = AutoAbstractor()
auto_abstractor.tokenizable_doc = SimpleTokenizer()
auto_abstractor.delimiter_list = [".", "\n"]
abstractable_doc = TopNRankAbstractor()

def summarize_comments(comments):
    combined_comments = ' '.join(comments)
    result_dict = auto_abstractor.summarize(combined_comments, abstractable_doc)
    summary = ' '.join(result_dict["summarize_result"])
    return summary

# Group by RTSK Number and summarize
summarized_df = df.groupby('RTSK Number')['RTSK Worknote'].apply(summarize_comments).reset_index()
summarized_df.columns = ['RTSK Number', 'Summary']

print(summarized_df)

This gives you a dataframe with one row per RTSK Number and its corresponding Summary. The example joins all worknotes for each RTSK Number into a single text and then summarizes that text.
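To make the join step concrete, here is a minimal sketch of the text each group hands to the summarizer. It uses plain Python rather than pandas so it runs without any installs; the helper name `join_by_key` is hypothetical and only mirrors what the groupby plus `' '.join` does in the script.

```python
from collections import defaultdict

def join_by_key(rows):
    """Collect worknotes per RTSK Number and join each group into one string."""
    grouped = defaultdict(list)
    for number, note in rows:
        grouped[number].append(note)
    # One combined text per RTSK Number, in first-seen order
    return {number: ' '.join(notes) for number, notes in grouped.items()}

rows = [
    ('001', 'Worknote 1 for RTSK 001.'),
    ('001', 'Worknote 2 for RTSK 001.'),
    ('002', 'Worknote 1 for RTSK 002.'),
    ('002', 'Worknote 2 for RTSK 002.'),
]

combined = join_by_key(rows)
for number, text in combined.items():
    print(number, '->', text)
```

Each combined string is what `auto_abstractor.summarize` receives for that RTSK Number, so very short groups may come back from the summarizer nearly unchanged.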


---------------------------------------------------------------------------------------------------------------

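If some worknotes are not strings (numbers, for example), joining them fails. This version of the function converts every comment to a string first:

def summarize_comments(comments):
    # Convert all comments to strings so ' '.join() does not raise a TypeError
    comments = [str(comment) for comment in comments]
    combined_comments = ' '.join(comments)
    result_dict = auto_abstractor.summarize(combined_comments, abstractable_doc)
    summary = ' '.join(result_dict["summarize_result"])
    return summary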
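One thing to watch with plain `str()` conversion: missing values become the literal text 'None' or 'nan', which then ends up in the text being summarized. Below is a minimal sketch of a cleaning step you could run before joining. The helper name `clean_comments` is hypothetical, and skipping empty values is an assumption about what you want, not something pysummarization requires.

```python
import math

def clean_comments(comments):
    """Return comments as stripped strings, skipping None, NaN, and blank values."""
    cleaned = []
    for comment in comments:
        if comment is None:
            continue
        # pandas represents missing values as float NaN
        if isinstance(comment, float) and math.isnan(comment):
            continue
        text = str(comment).strip()
        if text:
            cleaned.append(text)
    return cleaned

sample = ['Restarted the service.', None, float('nan'), 42, '  ']
print(clean_comments(sample))
```

Inside `summarize_comments`, you would call `clean_comments(comments)` in place of the list comprehension and join the result as before.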

