In this project, you will create a Chrome Extension that makes a request to a backend REST API, which performs NLP and responds with a summarized version of a YouTube transcript.
An enormous number of video recordings are created and shared on the Internet throughout the day. It has become difficult to spend time watching videos that may run longer than expected, and the effort may be futile if we cannot find relevant information in them. Automatically summarizing the transcripts of such videos lets us quickly pick out the important patterns in a video and saves the time and effort of going through its whole content.
This project gives us an opportunity to gain hands-on experience with state-of-the-art NLP techniques for abstractive text summarization and to implement an interesting idea, suitable for intermediates and a refreshing hobby project for professionals.
The project consists of the following stages:
APIs changed the way we build applications. There are countless examples of APIs in the world, and many ways to structure or set them up. In this milestone, we are going to see how to create a back-end application directory and structure it to work with the required files. We are going to isolate the back-end of the application to avoid conflicting dependencies from other parts of the project.
Create a back-end directory containing the files app.py and requirements.txt.
Initialize the app.py file with basic Flask RESTful boilerplate, following the tutorial link mentioned in the Reference Section below.
Create a virtual environment with pip installed, which will act as an isolated location (a directory) where everything resides.
Activate the virtual environment and install the project dependencies using pip.
Run pip freeze and redirect the output to the requirements.txt file. This requirements.txt file specifies which Python packages are required to run the project.
You are expected to initialize the back-end portion of your application with the required boilerplate as well as the dependencies.
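As a rough illustration of what the app.py boilerplate could look like, here is a minimal sketch that assumes Flask is installed in the virtual environment. The create_app factory name and the "/" health-check route are placeholders chosen for this example, not part of the milestone.

```python
def create_app():
    """Build a minimal Flask application (application-factory style)."""
    # Imported inside the factory so this module can be imported even
    # before Flask has been installed into the virtual environment.
    from flask import Flask, jsonify

    app = Flask(__name__)

    # Hypothetical health-check route, used only to confirm the server runs.
    @app.route("/")
    def index():
        return jsonify({"status": "ok"})

    return app
```

Calling create_app().run(debug=True) from a script starts Flask's development server, which is enough to confirm that the boilerplate and the installed dependencies work together.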
Ever wondered how to get your YouTube video's transcripts? In this milestone, we are going to use a Python API that fetches the transcripts/subtitles for a given YouTube video. It also works for automatically generated subtitles, supports translating subtitles, and does not require a headless browser, as other Selenium-based solutions do.
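In app.py, create a function that accepts a YouTube video ID and returns its transcript. The API returns the transcript as a list of dictionaries like the following: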
[
    {
        'text': 'Hey there',
        'start': 7.58,
        'duration': 6.13
    },
    {
        'text': 'how are you',
        'start': 14.08,
        'duration': 7.58
    },
    ...
]
Parse this response and return the whole transcript as a single string:
Hey there how are you ...
You should be able to fetch the transcript with this function, whose output we will later feed into the NLP processor in the pipeline.
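The parsing step can be sketched independently of how the list is fetched. The helper below, whose name transcript_to_text is chosen for this example, assumes only that each entry is a dictionary with a 'text' key, as in the sample response.

```python
def transcript_to_text(entries):
    """Join the 'text' field of each transcript entry into one string."""
    # Each entry also carries 'start' and 'duration', but only 'text'
    # is needed as input for summarization.
    parts = (entry["text"].strip() for entry in entries)
    return " ".join(part for part in parts if part)


sample = [
    {"text": "Hey there", "start": 7.58, "duration": 6.13},
    {"text": "how are you", "start": 14.08, "duration": 7.58},
]
print(transcript_to_text(sample))  # Hey there how are you
```

Keeping this step separate from the network call makes it easy to test with hand-written entries like the sample above.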
Text summarization is the task of shortening long pieces of text into a concise summary that preserves key information content and overall meaning.
There are two different approaches that are widely used for text summarization:
Extractive Summarization: This is where the model identifies the important sentences and phrases from the original text and only outputs those.
Abstractive Summarization: The model produces a completely different text that is shorter than the original. It generates new sentences in a new form, just like humans do. In this project, we will use transformers for this approach.
In this milestone, we will use HuggingFace's transformers library in Python to perform abstractive text summarization on the transcript obtained in the previous milestone.
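In app.py, create a function that accepts the transcript and returns its summary.
Summarization is usually done with an encoder-decoder model such as Bart or T5.
Use the PreTrainedModel.generate() method to generate the summary.
You should be able to verify that the model generates a completely new summarized text that is different from the original text.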
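One possible shape for the summarization function is sketched below. It assumes the transformers library and PyTorch are installed and uses the publicly available "t5-small" checkpoint as an example choice. The word-based chunking helper is a simplification introduced here so that long transcripts fit the model's input limit; it is not part of the milestone, and the 400-word and 150-token limits are arbitrary starting values.

```python
def chunk_words(text, max_words=400):
    """Split text into pieces of at most max_words whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def summarize(text, checkpoint="t5-small"):
    """Summarize text chunk by chunk and join the partial summaries."""
    # Imported lazily because transformers and PyTorch are heavy dependencies.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

    summaries = []
    for chunk in chunk_words(text):
        # T5 checkpoints expect a task prefix; Bart checkpoints do not need one.
        inputs = tokenizer("summarize: " + chunk, return_tensors="pt",
                           truncation=True, max_length=512)
        # generate() returns token ids, which are decoded back into text.
        output_ids = model.generate(**inputs, max_length=150, num_beams=4)
        summaries.append(tokenizer.decode(output_ids[0], skip_special_tokens=True))
    return " ".join(summaries)
```

Loading the tokenizer and model on every call is slow; in a long-running server it would be reasonable to load them once at startup instead.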
The next step is to define the resources that will be exposed by this backend service. This is an extremely simple application with a single endpoint, so our only resource will be the summarized text.
In app.py, create an API route for the summary with a URI of the form http://[hostname]/api/summarize?youtube_url=<url>.
You should be able to create an endpoint to summarize YouTube video transcripts and test the response with different video URLs.
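The logic behind such an endpoint can be sketched without tying it to a web framework. In the example below, extract_video_id and handle_summarize are hypothetical helper names, the transcript and summary functions are passed in as arguments, and the JSON field names "summary" and "error" are assumptions. Only standard watch URLs with a v query parameter are handled.

```python
from urllib.parse import parse_qs, urlparse


def extract_video_id(youtube_url):
    """Return the value of the `v` query parameter, or None if it is absent."""
    values = parse_qs(urlparse(youtube_url).query).get("v")
    return values[0] if values else None


def handle_summarize(youtube_url, fetch_transcript, summarize):
    """Return a (payload, HTTP status) pair for the /api/summarize resource."""
    if not youtube_url:
        return {"error": "missing youtube_url parameter"}, 400
    video_id = extract_video_id(youtube_url)
    if video_id is None:
        return {"error": "could not find a video id in youtube_url"}, 400
    transcript = fetch_transcript(video_id)
    return {"summary": summarize(transcript)}, 200


# Stub functions stand in for the real transcript and summary steps.
payload, status = handle_summarize(
    "https://www.youtube.com/watch?v=abc123",
    fetch_transcript=lambda video_id: "transcript for " + video_id,
    summarize=lambda text: text.upper(),
)
print(status, payload)  # 200 {'summary': 'TRANSCRIPT FOR ABC123'}
```

Inside a Flask route, the first argument would come from request.args.get("youtube_url"), and the returned pair maps directly onto a JSON response body and its status code.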
Extensions are small software programs that customize the browsing experience. They enable users to tailor Chrome functionality and behavior to individual preferences. They are built on web technologies such as HTML, CSS and JavaScript. In this milestone, we are going to see how to create a recommended Chrome extension application directory and structure it to work with the required files.
Create a manifest.json file with the following content:
{
    "manifest_version": 2,
    "name": "YSummarize",
    "description": "An extension to provide summarized transcript of a YouTube Subtitle eligible Video.",
    "version": "1.0",
    "permissions": ["activeTab"]
}
Navigate to chrome://extensions and turn on developer mode from the top right-hand corner.
You should be able to create a recommended Chrome extension application directory and structure it to work with the required files.
We need a user interface so that the user can interact with the extension. A popup is one of several types of user interface that a Chrome extension can provide; it usually appears when the user clicks the extension icon in the browser toolbar.
Add page_action to the manifest file, which enables the user interface for a popup:
{
.
.
.
    "page_action": {
        "default_popup": "popup.html"
    }
.
.
}
In the popup.html file,
Include the popup.css file to make the styles available to the HTML elements.
Include the popup.js file to enable user interaction and behavior with the HTML elements.
Add a button element named Summarize which, when clicked, emits a click event that an event listener detects and responds to.
Add a div element where the summarized text will be displayed when it is received from the backend REST API call.
In the popup.css file,
Style the button and div elements for a better user experience.
The extension user interface should be purposeful and minimal, and must enhance the browsing experience without distracting from it.
We have provided a basic UI that lets users interact with the extension and displays the summarized text, but some missing links must still be addressed. In this milestone, we will add functionality that allows the extension to interact with the backend server using HTTP REST API calls.
In popup.js,
Add a click event listener to the Summarize button and pass an anonymous callback function as the second parameter.
In the callback, send the message generate using the chrome.runtime.sendMessage method to notify contentScript.js to execute summary generation.
Add a chrome.runtime.onMessage listener for the message result from contentScript.js, which will execute the outputSummary callback function.
In the outputSummary callback, display the summary in the div element programmatically using JavaScript.
Add the lines below to content_scripts in the manifest file, which will inject the content script contentScript.js declaratively and execute it automatically on matching pages.
{
.
.
.
"content_scripts":[
{
"matches":["https://www.youtube.com/watch?v=*"],
"js": ["contentScript.js"]
}
],
.
.
.
}
In contentScript.js,
Add a chrome.runtime.onMessage listener for the message generate, which will execute the generateSummary callback function.
In the callback, make a request to the backend with the XMLHttpRequest Web API to receive the summarized text as a response.
Send the message result with the summary payload using chrome.runtime.sendMessage to notify popup.js to display the summarized text.
The extension user interface should be able to display the summarized text upon request from the user.
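The request that contentScript.js sends can also be reproduced outside the browser, which helps separate backend problems from extension problems. The sketch below uses only the Python standard library. The base URL http://localhost:5000, the helper names, and the assumption that the backend replies with a JSON body are all illustrative.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_summarize_url(base_url, youtube_url):
    """Build the /api/summarize request URL with youtube_url percent-encoded."""
    query = urlencode({"youtube_url": youtube_url})
    return base_url.rstrip("/") + "/api/summarize?" + query


def request_summary(base_url, youtube_url, timeout=120):
    """Call the backend and return the decoded JSON body (assumed format)."""
    url = build_summarize_url(base_url, youtube_url)
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Only builds the URL; request_summary needs the backend to be running.
print(build_summarize_url("http://localhost:5000",
                          "https://www.youtube.com/watch?v=abc123"))
```

Percent-encoding the video URL matters because it contains its own ? and = characters, which would otherwise be confused with the query string of the API request.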
As the basic implementation is all done, for all the curious cats out there, these are some of the line items which can be implemented to spice up the existing functionality.
[Note: This is not a mandatory milestone.]
You should be able to add more features to your application.