20 Commits

Author SHA1 Message Date
Puyodead1
fa7e566327 some fixes for char decoding errors 2021-11-27 12:41:16 -05:00
Puyodead1
6237f0e61a add warning about account suspensions 2021-11-26 22:39:48 -05:00
Puyodead1
ecadd0b880 Add --disable-ipv6 option
+ Added `--disable-ipv6` option to disable ipv6 in aria2
+ Updated README to reflect argument changes
2021-11-19 16:28:44 -05:00
Puyodead1
07d4698479 update to allow downloading if using udemy subscription
+ New requirements: `beautifulsoup4` and `lxml`
+ Added support for downloading courses included in subscription plans
+ Updated README to reflect changes
2021-11-10 09:10:51 -05:00
Puyodead1
4758b5a5ab Fix typo in readme 2021-11-01 10:35:43 -04:00
Puyodead1
408072be84 Update version.py 2021-11-01 10:33:04 -04:00
Puyodead1
d736df6739 Update version.py 2021-11-01 10:32:00 -04:00
Puyodead1
19cb48ad13 ability to specify an encoder and custom framerate (Closes #57)
+ Added command line argument `ffmpeg-framerate` and `h265-encoder`
+ Added some info logging about selected options
+ Updated README to reflect new arguments
2021-11-01 10:27:58 -04:00
Puyodead1
5d60bfe1ce Fix 403 errors when fetching course information 2021-10-18 11:07:49 -04:00
Puyodead1
b4a4c1027e fix: --info throwing an error 2021-10-12 17:44:32 -04:00
Puyodead1
38f58213f4 remove asset id from start of file name 2021-09-27 08:32:57 -04:00
Puyodead1
6f85fdaaa1 fix: .env file not being loaded before argument parser 2021-09-25 16:09:06 -04:00
Puyodead1
5915b28054 fix: non-drm videos not using h265 if specified 2021-09-25 15:35:37 -04:00
Puyodead1
f3a32a2dd6 Bug fixes
- Fixed captions not being downloaded
- Fixed trying to load keyfile even if it doesn't exist
- Moved asset and subtitle download processing into lecture processing function (in preparation of subtitle merging)
- Fixed an error in ffmpeg command when not using h265
- no longer need to specify full path to UdemyDownloader.py, also updated readme to reflect this
2021-09-25 12:14:08 -04:00
Puyodead1
ec6ac28d0b add missing dependency 2021-09-24 23:50:42 -04:00
Puyodead1
7750256e4c fix
python tells me one thing, while my editor tells me another 😒
2021-09-24 23:37:04 -04:00
Puyodead1
4b2126c839 Update README.md
add a little bit more setup information
2021-09-24 18:32:13 -04:00
Puyodead1
cfc0fa7ce9 Update README.md 2021-09-24 18:27:36 -04:00
Puyodead1
49bbdc9f56 update readme
- document h265 options
- update example commands
2021-09-24 18:26:53 -04:00
Puyodead1
d925ff240b updates and new features
- split main.py into smaller classes
- add support for FFMPEG h.265 (closes #55)
2021-09-24 18:22:19 -04:00
21 changed files with 2196 additions and 1802 deletions

16
.github/stale.yml vendored Normal file

@@ -0,0 +1,16 @@
# Number of days of inactivity before an issue becomes stale
daysUntilStale: 60
# Number of days of inactivity before a stale issue is closed
daysUntilClose: 7
# Issues with these labels will never be considered stale
exemptLabels:
- pinned
- security
# Label to use when marking an issue as stale
staleLabel: stale
# Comment to post when marking an issue as stale. Set to `false` to disable
markComment: >
  This issue has been automatically marked as stale because it has not had
  recent activity. It will be closed if no further activity occurs.
# Comment to post when closing a stale issue. Set to `false` to disable
closeComment: true

3
.gitignore vendored

@@ -122,4 +122,5 @@ manifest.mpd
saved
*.aria2
info.py
.idea/
.idea/
cookies.txt

124
README.md

@@ -11,6 +11,7 @@
# NOTE
- **This tool will not work without decryption keys, and there is currently no public way to get those keys. Do not bother installing unless you already have keys!**
- **Downloading courses is against Udemy's Terms of Service. I am NOT responsible for your account getting suspended as a result of using this program!**
- This program is WIP. The code is provided as-is, and I am not responsible for any legal issues resulting from the use of this program.
# Support
@@ -23,12 +24,12 @@ All code is licensed under the MIT license
# Description
Utility script to download Udemy courses, has support for DRM videos but requires the user to aquire the decryption key (for legal reasons).<br>
Utility script to download Udemy courses, has support for DRM videos but requires the user to acquire the decryption key (for legal reasons).<br>
Windows is the primary development OS, but I've made an effort to support Linux also.
# Requirements
1. You would need to download `ffmpeg`, `aria2c`, `mp4decrypt` (from Bento4 SDK) and ``yt-dlp`` (``pip install yt-dlp``). Ensure they are in the system path (typing their name in cmd should invoke them).
1. You would need to download `ffmpeg`, `aria2c`, `mp4decrypt` (from Bento4 SDK) and `yt-dlp` (this is installed with the other requirements). Ensure they are in the system path (typing their name in cmd should invoke them).
# Usage
@@ -43,10 +44,12 @@ You will need to get a few things before you can use this program:
### Setting up
- install python 3.6+
- install requirements: `pip install -r requirements.txt`
- rename `.env.sample` to `.env` _(you only need to do this if you plan to use the .env file to store your bearer token)_
- rename `keyfile.example.json` to `keyfile.json`
- rename `keyfile.example.json` to `keyfile.json` _(this is only required if you plan to download DRM encrypted lectures)_
### Aquire Bearer Token
### Acquire Bearer Token
- Firefox: [Udemy-DL Guide](https://github.com/r0oth3x49/udemy-dl/issues/389#issuecomment-491903900)
- Chrome: [Udemy-DL Guide](https://github.com/r0oth3x49/udemy-dl/issues/389#issuecomment-492569372)
@@ -54,7 +57,7 @@ You will need to get a few things before you can use this program:
### Key ID and Key
It is up to you to aquire the key and key ID. Please don't ask me for help acquiring these, decrypting DRM protected content can be considered piracy.
I would rather not instruct you how to get these as its a grey area in terms of legality. I would prefer if you don't ask me for help getting these.
- Enter the key and key id in the `keyfile.json`
- ![keyfile example](https://i.imgur.com/e5aU0ng.png)
@@ -64,65 +67,120 @@ It is up to you to aquire the key and key ID. Please don't ask me for help acqui
You can now run the program, see the examples below. The course will download to `out_dir`.
# Udemy Subscription Plans
To download a course that is included in a subscription plan and that you did not purchase individually, you will need to follow a few more steps to get set up.
## Getting your cookies
- Go to the page of the course you want to download
- press `control` + `shift` + `i` (this may be different depending on your OS, just google how to open developer tools)
- click the `Console` tab
- copy and paste `document.cookie` and press enter
- copy the text between the quotes
## Setup token file
- Create a file called `cookies.txt`
- Paste the cookie into the file
- save and close the file
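The cookie steps above can be sketched as follows. This is a minimal illustration, not the tool's actual loader: `load_cookie_header` and `build_headers` are hypothetical names, and it assumes the cookie string is sent verbatim as the `Cookie` header next to the two bearer headers.

```python
from pathlib import Path


def load_cookie_header(path="cookies.txt"):
    """Return the cookie string saved from `document.cookie`, or "" if the file is absent."""
    cookie_file = Path(path)
    if not cookie_file.is_file():
        return ""
    # Drop surrounding whitespace and any quotes copied from the console by accident.
    return cookie_file.read_text(encoding="utf-8").strip().strip("'\"")


def build_headers(bearer_token="", cookie_header=""):
    """Build request headers from whichever credentials are available (hypothetical helper)."""
    headers = {}
    if bearer_token:
        headers["Authorization"] = f"Bearer {bearer_token}"
        headers["X-Udemy-Authorization"] = f"Bearer {bearer_token}"
    if cookie_header:
        headers["Cookie"] = cookie_header
    return headers
```

Stripping the quotes is a convenience for the common mistake of copying the quoted console output as-is.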
# Advanced Usage
```
usage: main.py [-h] -c COURSE_URL [-b BEARER_TOKEN] [-q QUALITY] [-l LANG] [-cd CONCURRENT_DOWNLOADS] [--skip-lectures] [--download-assets]
[--download-captions] [--keep-vtt] [--skip-hls] [--info]
usage: udemy_downloader [-h] -c COURSE_URL [-b BEARER_TOKEN] [-q QUALITY] [-l LANG] [-cd CONCURRENT_CONNECTIONS]
[--skip-lectures] [--download-assets] [--download-captions] [--keep-vtt] [--skip-hls] [--info]
[--use-h265] [--h265-crf H265_CRF] [--ffmpeg-preset FFMPEG_PRESET]
[--ffmpeg-framerate FFMPEG_FRAMERATE] [--h265-encoder H265_ENCODER] [--disable-ipv6] [-v]
Udemy Downloader
optional arguments:
options:
-h, --help show this help message and exit
-c COURSE_URL, --course-url COURSE_URL
The URL of the course to download
-b BEARER_TOKEN, --bearer BEARER_TOKEN
The Bearer token to use
-q QUALITY, --quality QUALITY
Download specific video quality. If the requested quality isn't available, the closest quality will be used. If not
specified, the best quality will be downloaded for each lecture
-l LANG, --lang LANG The language to download for captions, specify 'all' to download all captions (Default is 'en')
-cd CONCURRENT_DOWNLOADS, --concurrent-downloads CONCURRENT_DOWNLOADS
The number of maximum concurrent downloads for segments (HLS and DASH, must be a number 1-50)
Download specific video quality. If the requested quality isn't available, the closest quality
will be used. If not specified, the best quality will be downloaded for each lecture
-l LANG, --lang LANG The language to download for captions, specify 'all' to download all captions (Default is
'en')
-cd CONCURRENT_CONNECTIONS, --concurrent-connections CONCURRENT_CONNECTIONS
The number of maximum concurrent connections per download for segments (HLS and DASH, must be
a number 1-30)
--skip-lectures If specified, lectures won't be downloaded
--download-assets If specified, lecture assets will be downloaded
--download-captions If specified, captions will be downloaded
--keep-vtt If specified, .vtt files won't be removed
--skip-hls If specified, hls streams will be skipped (faster fetching) (hls streams usually contain 1080p quality for non-drm
lectures)
--skip-hls If specified, hls streams will be skipped (faster fetching) (hls streams usually contain 1080p
quality for non-drm lectures)
--info If specified, only course information will be printed, nothing will be downloaded
--use-h265 If specified, videos will be encoded with the H.265 codec
--h265-crf H265_CRF Set a custom CRF value for H.265 encoding. FFMPEG default is 28
--ffmpeg-preset FFMPEG_PRESET
Set a custom preset value for encoding. This can vary depending on the encoder
--ffmpeg-framerate FFMPEG_FRAMERATE
Changes the FPS used for encoding. FFMPEG default is 30
--h265-encoder H265_ENCODER
Changes the HEVC encoder that is used. Default is copy when not using h265, otherwise the
default is libx265
--disable-ipv6 If specified, ipv6 will be disabled in aria2
-v, --version show program's version number and exit
```
- Passing a Bearer Token and Course ID as an argument
- `python main.py -c <Course URL> -b <Bearer Token>`
- `python main.py -c https://www.udemy.com/courses/myawesomecourse -b <Bearer Token>`
- `python udemy_downloader -c <Course URL> -b <Bearer Token>`
- `python udemy_downloader -c https://www.udemy.com/courses/myawesomecourse -b <Bearer Token>`
- Download a specific quality
- `python main.py -c <Course URL> -q 720`
- `python udemy_downloader -c <Course URL> -q 720`
- Download assets along with lectures
- `python main.py -c <Course URL> --download-assets`
- `python udemy_downloader -c <Course URL> --download-assets`
- Download assets and specify a quality
- `python main.py -c <Course URL> -q 360 --download-assets`
- `python udemy_downloader -c <Course URL> -q 360 --download-assets`
- Download captions (Defaults to English)
- `python main.py -c <Course URL> --download-captions`
- `python udemy_downloader -c <Course URL> --download-captions`
- Download captions with specific language
- `python main.py -c <Course URL> --download-captions -l en` - English subtitles
- `python main.py -c <Course URL> --download-captions -l es` - Spanish subtitles
- `python main.py -c <Course URL> --download-captions -l it` - Italian subtitles
- `python main.py -c <Course URL> --download-captions -l pl` - Polish Subtitles
- `python main.py -c <Course URL> --download-captions -l all` - Downloads all subtitles
- `python udemy_downloader -c <Course URL> --download-captions -l en` - English subtitles
- `python udemy_downloader -c <Course URL> --download-captions -l es` - Spanish subtitles
- `python udemy_downloader -c <Course URL> --download-captions -l it` - Italian subtitles
- `python udemy_downloader -c <Course URL> --download-captions -l pl` - Polish Subtitles
- `python udemy_downloader -c <Course URL> --download-captions -l all` - Downloads all subtitles
- etc
- Skip downloading lecture videos
- `python main.py -c <Course URL> --skip-lectures --download-captions` - Downloads only captions
- `python main.py -c <Course URL> --skip-lectures --download-assets` - Downloads only assets
- `python udemy_downloader -c <Course URL> --skip-lectures --download-captions` - Downloads only captions
- `python udemy_downloader -c <Course URL> --skip-lectures --download-assets` - Downloads only assets
- Keep .VTT caption files:
- `python main.py -c <Course URL> --download-captions --keep-vtt`
- `python udemy_downloader -c <Course URL> --download-captions --keep-vtt`
- Skip parsing HLS Streams (HLS streams usually contain 1080p quality for Non-DRM lectures):
- `python main.py -c <Course URL> --skip-hls`
- `python udemy_downloader -c <Course URL> --skip-hls`
- Print course information only:
- `python main.py -c <Course URL> --info`
- `python udemy_downloader -c <Course URL> --info`
- Specify max number of concurrent downloads:
- `python main.py -c <Course URL> --concurrent-downloads 20`
- `python main.py -c <Course URL> -cd 20`
- `python udemy_downloader -c <Course URL> --concurrent-connections 20`
- `python udemy_downloader -c <Course URL> -cd 20`
- Encode in H.265:
- `python udemy_downloader -c <Course URL> --use-h265`
- Encode in H.265 with custom CRF:
- `python udemy_downloader -c <Course URL> --use-h265 --h265-crf 20`
- Encode in H.265 with custom preset using the default encoder (libx265):
- `python udemy_downloader -c <Course URL> --use-h265 --ffmpeg-preset faster`
- Encode in H.265 with custom preset using a custom encoder:
- **Note**: _The presets may be different depending on the encoder! For example: `hevc_nvenc` default is `p4` and `libx265` is `medium`_
- _You can view encoder help with `ffmpeg -h encoder=<encoder name>`, ex: `ffmpeg -h encoder=hevc_nvenc`_
- `python udemy_downloader -c <Course URL> --use-h265 --h265-encoder hevc_nvenc --ffmpeg-preset p7`
- Encode in H.265 with a custom framerate:
- `python udemy_downloader -c <Course URL> --use-h265 --ffmpeg-framerate 24`
If you encounter errors while downloading such as
`errorCode=1 Network problem has occurred. cause:Unknown socket error 10051 (0x2743)`
or
`errorCode=1 Network problem has occurred. cause:A socket operation was attempted to an unreachable network.`
Then try disabling ipv6 in aria2 using the `--disable-ipv6` option
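As a rough illustration of what the option does, the sketch below builds an aria2c command line and appends aria2's own `--disable-ipv6` switch when requested. The `main.py` diff is suppressed here, so how the tool really assembles its aria2 command is not shown; `build_aria2_args` and its parameters are hypothetical.

```python
def build_aria2_args(url, out_dir, filename, disable_ipv6=False):
    """Return an argument list for aria2c (hypothetical helper, not the tool's code)."""
    args = ["aria2c", url, f"--dir={out_dir}", f"--out={filename}"]
    if disable_ipv6:
        # aria2 resolves and connects over IPv4 only when this is set.
        args.append("--disable-ipv6=true")
    return args
```

The resulting list can be handed to `subprocess.run(args, check=True)` once `aria2c` is on the system path.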
# Credits


@@ -1 +0,0 @@
__version__ = "1.1.2"

1732
main.py

File diff suppressed because it is too large


@@ -10,5 +10,7 @@ m3u8
colorama
yt-dlp
bitstring
cloudscraper
unidecode
six
beautifulsoup4
lxml

31
setup.py Normal file

@@ -0,0 +1,31 @@
from setuptools import setup, Command, find_packages
exec(compile(open('udemy_downloader/version.py').read(), 'udemy_downloader/version.py', 'exec'))
packages = find_packages()
setup(
name="udemy-downloader",
version="1.2.2",
author="Puyodead1",
author_email="puyodead@protonmail.com",
description="Utility script to download DRM encrypted lectures from Udemy",
url="https://github.com/Puyodead1/udemy-downloader",
project_urls={
"Bug Tracker": "https://github.com/Puyodead1/udemy-downloader/issues",
},
classifiers=[
"Programming Language :: Python :: 3",
"License :: OSI Approved :: MIT License",
"Operating System :: OS Independent",
"Natural Language :: English",
"Topic :: Multimedia",
"Topic :: Utilities"
],
install_requires=["mpegdash", "sanitize_filename", "tqdm", "requests", "python-dotenv", "protobuf", "webvtt-py", "pysrt", "m3u8", "colorama", "yt-dlp", "bitstring", "unidecode", "six"],
packages=packages,
python_requires=">=3.6",
entry_points={
'console_scripts': ["udemy-downloader = udemy_downloader:UdemyDownloader"]
}
)


@@ -0,0 +1,59 @@
"""
This file was modified from udemy-dl
https://github.com/r0oth3x49/udemy-dl/
Copyright (c) 2018-2025 Nasir Khan (r0ot h3x49)
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the
Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software,
and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR
ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH
THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
"""
import time

import requests

from constants import HEADERS


class Session(object):
    def __init__(self):
        self._headers = HEADERS
        self._session = requests.sessions.Session()

    def _set_auth_headers(self, access_token="", cookies=""):
        # cookies is the raw cookie string that is sent as the Cookie header
        self._headers["Authorization"] = "Bearer {}".format(access_token)
        self._headers["X-Udemy-Authorization"] = "Bearer {}".format(
            access_token)
        self._headers["Cookie"] = cookies

    def _get(self, url):
        # Try up to 10 times. 502/503 responses are returned as-is because
        # callers handle them. Returns None if every attempt fails.
        for i in range(10):
            session = self._session.get(url, headers=self._headers)
            if session.ok or session.status_code in [502, 503]:
                return session
            print('Failed request ' + url)
            print(
                f"{session.status_code} {session.reason}, retrying (attempt {i + 1})...")
            time.sleep(0.8)

    def _post(self, url, data, redirect=True):
        session = self._session.post(url,
                                     data,
                                     headers=self._headers,
                                     allow_redirects=redirect)
        if session.ok:
            return session
        raise Exception(f"{session.status_code} {session.reason}")

    def terminate(self):
        # Reset the auth headers to empty values
        self._set_auth_headers()

596
udemy_downloader/Udemy.py Normal file

@@ -0,0 +1,596 @@
"""
This file was modified from udemy-dl
https://github.com/r0oth3x49/udemy-dl/
Copyright (c) 2018-2025 Nasir Khan (r0ot h3x49)
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the
Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software,
and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR
ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH
THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
"""
import time
import sys
import m3u8
import yt_dlp
import re
import json
from requests.exceptions import ConnectionError as conn_error
from UdemyAuth import UdemyAuth
from utils import _clean
from constants import COURSE_SEARCH, COURSE_URL, COURSE_INFO_URL, MY_COURSES_URL, COLLECTION_URL
from bs4 import BeautifulSoup
class Udemy:
def __init__(self, access_token, cookies):
self.session = None
self.access_token = None
self.auth = UdemyAuth(cache_session=False)
if not self.session:
self.session, self.access_token = self.auth.authenticate(
access_token=access_token, cookies=cookies)
if self.session and self.access_token:
self.session._headers.update(
{"Authorization": "Bearer {}".format(self.access_token)})
self.session._headers.update({
"X-Udemy-Authorization":
"Bearer {}".format(self.access_token)
})
print("Login Success")
else:
print("Login Failure!")
sys.exit(1)
def _extract_supplementary_assets(self, supp_assets):
_temp = []
for entry in supp_assets:
title = _clean(entry.get("title"))
filename = entry.get("filename")
download_urls = entry.get("download_urls")
external_url = entry.get("external_url")
asset_type = entry.get("asset_type").lower()
id = entry.get("id")
if asset_type == "file":
if download_urls and isinstance(download_urls, dict):
extension = filename.rsplit(
".", 1)[-1] if "." in filename else ""
download_url = download_urls.get("File", [])[0].get("file")
_temp.append({
"type": "file",
"title": title,
"filename": filename,
"extension": extension,
"download_url": download_url,
"id": id
})
elif asset_type == "sourcecode":
if download_urls and isinstance(download_urls, dict):
extension = filename.rsplit(
".", 1)[-1] if "." in filename else ""
download_url = download_urls.get("SourceCode",
[])[0].get("file")
_temp.append({
"type": "source_code",
"title": title,
"filename": filename,
"extension": extension,
"download_url": download_url,
"id": id
})
elif asset_type == "externallink":
_temp.append({
"type": "external_link",
"title": title,
"filename": filename,
"extension": "txt",
"download_url": external_url,
"id": id
})
return _temp
def _extract_ppt(self, asset):
_temp = []
download_urls = asset.get("download_urls")
filename = asset.get("filename")
id = asset.get("id")
if download_urls and isinstance(download_urls, dict):
extension = filename.rsplit(".", 1)[-1] if "." in filename else ""
download_url = download_urls.get("Presentation", [])[0].get("file")
_temp.append({
"type": "presentation",
"filename": filename,
"extension": extension,
"download_url": download_url,
"id": id
})
return _temp
def _extract_file(self, asset):
_temp = []
download_urls = asset.get("download_urls")
filename = asset.get("filename")
id = asset.get("id")
if download_urls and isinstance(download_urls, dict):
extension = filename.rsplit(".", 1)[-1] if "." in filename else ""
download_url = download_urls.get("File", [])[0].get("file")
_temp.append({
"type": "file",
"filename": filename,
"extension": extension,
"download_url": download_url,
"id": id
})
return _temp
def _extract_ebook(self, asset):
_temp = []
download_urls = asset.get("download_urls")
filename = asset.get("filename")
id = asset.get("id")
if download_urls and isinstance(download_urls, dict):
extension = filename.rsplit(".", 1)[-1] if "." in filename else ""
download_url = download_urls.get("E-Book", [])[0].get("file")
_temp.append({
"type": "ebook",
"filename": filename,
"extension": extension,
"download_url": download_url,
"id": id
})
return _temp
def _extract_audio(self, asset):
_temp = []
download_urls = asset.get("download_urls")
filename = asset.get("filename")
id = asset.get("id")
if download_urls and isinstance(download_urls, dict):
extension = filename.rsplit(".", 1)[-1] if "." in filename else ""
download_url = download_urls.get("Audio", [])[0].get("file")
_temp.append({
"type": "audio",
"filename": filename,
"extension": extension,
"download_url": download_url,
"id": id
})
return _temp
def _extract_sources(self, sources, skip_hls):
_temp = []
if sources and isinstance(sources, list):
for source in sources:
label = source.get("label")
download_url = source.get("file")
if not download_url:
continue
if label and label.lower() == "audio":
continue
height = label if label else None
if height == "2160":
width = "3840"
elif height == "1440":
width = "2560"
elif height == "1080":
width = "1920"
elif height == "720":
width = "1280"
elif height == "480":
width = "854"
elif height == "360":
width = "640"
elif height == "240":
width = "426"
else:
width = "256"
if (source.get("type") == "application/x-mpegURL"
or "m3u8" in download_url):
if not skip_hls:
out = self._extract_m3u8(download_url)
if out:
_temp.extend(out)
else:
_type = source.get("type")
_temp.append({
"type": "video",
"height": height,
"width": width,
"extension": _type.replace("video/", ""),
"download_url": download_url,
})
return _temp
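The height-to-width chain in `_extract_sources` can also be read as a lookup table. The sketch below is an equivalent formulation under that reading, not the author's code: `WIDTH_BY_HEIGHT`, `DEFAULT_WIDTH` and `width_for_label` are hypothetical names, and the values are copied from the branches above.

```python
# Widths keyed by the height label Udemy reports, taken from the if/elif chain.
WIDTH_BY_HEIGHT = {
    "2160": "3840",
    "1440": "2560",
    "1080": "1920",
    "720": "1280",
    "480": "854",
    "360": "640",
    "240": "426",
}
DEFAULT_WIDTH = "256"  # fallback used by the final else branch


def width_for_label(label):
    """Return the width string for a height label, or the fallback for unknown labels."""
    return WIDTH_BY_HEIGHT.get(str(label), DEFAULT_WIDTH)
```

A table keeps the mapping in one place, so adding a resolution is a one-line change.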
def _extract_media_sources(self, sources):
_temp = []
if sources and isinstance(sources, list):
for source in sources:
_type = source.get("type")
src = source.get("src")
if _type == "application/dash+xml":
out = self._extract_mpd(src)
if out:
_temp.extend(out)
return _temp
def _extract_subtitles(self, tracks):
_temp = []
if tracks and isinstance(tracks, list):
for track in tracks:
if not isinstance(track, dict):
continue
if track.get("_class") != "caption":
continue
download_url = track.get("url")
if not download_url or not isinstance(download_url, str):
continue
lang = (track.get("language") or track.get("srclang")
or track.get("label")
or track["locale_id"].split("_")[0])
ext = "vtt" if "vtt" in download_url.rsplit(".",
1)[-1] else "srt"
_temp.append({
"type": "subtitle",
"language": lang,
"extension": ext,
"download_url": download_url,
})
return _temp
def _extract_m3u8(self, url):
"""extracts m3u8 streams"""
_temp = []
try:
resp = self.session._get(url)
resp.raise_for_status()
raw_data = resp.text
m3u8_object = m3u8.loads(raw_data)
playlists = m3u8_object.playlists
seen = set()
for pl in playlists:
resolution = pl.stream_info.resolution
codecs = pl.stream_info.codecs
if not resolution:
continue
if not codecs:
continue
width, height = resolution
download_url = pl.uri
if height not in seen:
seen.add(height)
_temp.append({
"type": "hls",
"height": height,
"width": width,
"extension": "mp4",
"download_url": download_url,
})
except Exception as error:
print(f"Udemy Says : '{error}' while fetching hls streams..")
return _temp
def _extract_mpd(self, url):
"""extracts mpd streams"""
_temp = []
try:
ytdl = yt_dlp.YoutubeDL({
'quiet': True,
'no_warnings': True,
"allow_unplayable_formats": True
})
results = ytdl.extract_info(url,
download=False,
force_generic_extractor=True)
seen = set()
formats = results.get("formats")
format_id = results.get("format_id")
best_audio_format_id = format_id.split("+")[1]
best_audio = next((x for x in formats
if x.get("format_id") == best_audio_format_id),
None)
for f in formats:
if "video" in (f.get("format_note") or ""):
# is a video stream
format_id = f.get("format_id")
extension = f.get("ext")
height = f.get("height")
width = f.get("width")
if height and height not in seen:
seen.add(height)
_temp.append({
"type": "dash",
"height": str(height),
"width": str(width),
"format_id": f"{format_id},{best_audio_format_id}",
"extension": extension,
"download_url": f.get("manifest_url")
})
else:
# unknown format type
continue
except Exception as error:
print(f"Error fetching MPD streams: '{error}'")
return _temp
def extract_course_name(self, url):
"""
@author r0oth3x49
"""
obj = re.search(
r"(?i)(?://(?P<portal_name>.+?).udemy.com/(?:course(/draft)*/)?(?P<name_or_id>[a-zA-Z0-9_-]+))",
url,
)
if obj:
return obj.group("portal_name"), obj.group("name_or_id")
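To see what `extract_course_name` pulls out of a URL, here is a standalone sketch of the same idea. It is a variant, not the original: the dots are escaped so they match literally, and `parse_course_url` (a hypothetical name) returns `None` explicitly when nothing matches.

```python
import re

# Variant of the pattern above with literal dots; group names are kept.
COURSE_URL_RE = re.compile(
    r"(?i)//(?P<portal_name>.+?)\.udemy\.com/(?:course(/draft)*/)?(?P<name_or_id>[a-zA-Z0-9_-]+)"
)


def parse_course_url(url):
    """Return (portal_name, name_or_id) for a Udemy URL, or None if it does not match."""
    match = COURSE_URL_RE.search(url)
    if not match:
        return None
    return match.group("portal_name"), match.group("name_or_id")
```

Callers that unpack the result into two names should check for `None` first, since unpacking `None` raises a `TypeError`.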
def extract_portal_name(self, url):
obj = re.search(r"(?i)(?://(?P<portal_name>.+?).udemy.com)", url)
if obj:
return obj.group("portal_name")
def _subscribed_courses(self, portal_name, course_name):
results = []
self.session._headers.update({
"Host":
"{portal_name}.udemy.com".format(portal_name=portal_name),
"Referer":
"https://{portal_name}.udemy.com/home/my-courses/search/?q={course_name}"
.format(portal_name=portal_name, course_name=course_name),
})
url = COURSE_SEARCH.format(portal_name=portal_name,
course_name=course_name)
try:
webpage = self.session._get(url).json()
except conn_error as error:
print(f"Udemy Says: Connection error, {error}")
time.sleep(0.8)
sys.exit(0)
except (ValueError, Exception) as error:
print(f"Udemy Says: {error} on {url}")
time.sleep(0.8)
sys.exit(0)
else:
results = webpage.get("results", [])
return results
def _extract_course_info_json(self, url, course_id, portal_name):
self.session._headers.update({"Referer": url})
url = COURSE_INFO_URL.format(
portal_name=portal_name, course_id=course_id)
try:
resp = self.session._get(url).json()
except conn_error as error:
print(f"Udemy Says: Connection error, {error}")
time.sleep(0.8)
sys.exit(0)
else:
return resp
def _extract_course_json(self, url, course_id, portal_name):
self.session._headers.update({"Referer": url})
url = COURSE_URL.format(portal_name=portal_name, course_id=course_id)
try:
resp = self.session._get(url)
if resp.status_code in [502, 503]:
print(
"> The course content is large, using large content extractor..."
)
resp = self._extract_large_course_content(url=url)
else:
resp = resp.json()
except conn_error as error:
print(f"Udemy Says: Connection error, {error}")
time.sleep(0.8)
sys.exit(0)
except (ValueError, Exception):
resp = self._extract_large_course_content(url=url)
return resp
else:
return resp
def _extract_large_course_content(self, url):
url = url.replace("10000", "50") if url.endswith("10000") else url
try:
data = self.session._get(url).json()
except conn_error as error:
print(f"Udemy Says: Connection error, {error}")
time.sleep(0.8)
sys.exit(0)
else:
_next = data.get("next")
while _next:
print("Downloading course information.. ")
try:
resp = self.session._get(_next).json()
except conn_error as error:
print(f"Udemy Says: Connection error, {error}")
time.sleep(0.8)
sys.exit(0)
else:
_next = resp.get("next")
results = resp.get("results")
if results and isinstance(results, list):
for d in resp["results"]:
data["results"].append(d)
return data
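The paging loop in `_extract_large_course_content` follows a common pattern: fetch a page, keep its `results`, then follow `next` until it is empty. A minimal sketch of that pattern, with the HTTP call replaced by an injected `fetch_json` callable (a hypothetical stand-in for `self.session._get(url).json()`):

```python
def collect_paginated(first_url, fetch_json):
    """Follow `next` links and merge every page's `results` into the first page's dict."""
    data = fetch_json(first_url)
    results = list(data.get("results") or [])
    next_url = data.get("next")
    while next_url:
        page = fetch_json(next_url)
        results.extend(page.get("results") or [])
        next_url = page.get("next")
    data["results"] = results
    return data
```

Injecting the fetch function keeps the loop testable without network access, for example by passing the `get` method of a dict of canned pages.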
def _extract_course(self, response, course_name):
_temp = {}
if response:
for entry in response:
course_id = str(entry.get("id"))
published_title = entry.get("published_title")
if course_name in (published_title, course_id):
_temp = entry
break
return _temp
def _my_courses(self, portal_name):
results = []
try:
url = MY_COURSES_URL.format(portal_name=portal_name)
webpage = self.session._get(url).json()
except conn_error as error:
print(f"Udemy Says: Connection error, {error}")
time.sleep(0.8)
sys.exit(0)
except (ValueError, Exception) as error:
print(f"Udemy Says: {error}")
time.sleep(0.8)
sys.exit(0)
else:
results = webpage.get("results", [])
return results
def _subscribed_collection_courses(self, portal_name):
url = COLLECTION_URL.format(portal_name=portal_name)
courses_lists = []
try:
webpage = self.session._get(url).json()
except conn_error as error:
print(f"Udemy Says: Connection error, {error}")
time.sleep(0.8)
sys.exit(0)
except (ValueError, Exception) as error:
print(f"Udemy Says: {error}")
time.sleep(0.8)
sys.exit(0)
else:
results = webpage.get("results", [])
if results:
[
courses_lists.extend(courses.get("courses", []))
for courses in results if courses.get("courses", [])
]
return courses_lists
def _archived_courses(self, portal_name):
results = []
try:
url = MY_COURSES_URL.format(portal_name=portal_name)
url = f"{url}&is_archived=true"
webpage = self.session._get(url).json()
except conn_error as error:
print(f"Udemy Says: Connection error, {error}")
time.sleep(0.8)
sys.exit(0)
except (ValueError, Exception) as error:
print(f"Udemy Says: {error}")
time.sleep(0.8)
sys.exit(0)
else:
results = webpage.get("results", [])
return results
    def _extract_course_info(self, url):
        portal_name, course_name = self.extract_course_name(url)
        course = {}
        # Try each course listing in turn until one contains the course.
        results = self._subscribed_courses(portal_name=portal_name,
                                           course_name=course_name)
        course = self._extract_course(response=results,
                                      course_name=course_name)
        if not course:
            results = self._my_courses(portal_name=portal_name)
            course = self._extract_course(response=results,
                                          course_name=course_name)
        if not course:
            results = self._subscribed_collection_courses(
                portal_name=portal_name)
            course = self._extract_course(response=results,
                                          course_name=course_name)
        if not course:
            results = self._archived_courses(portal_name=portal_name)
            course = self._extract_course(response=results,
                                          course_name=course_name)
        if not course:
            # Last resort: read the course id from the course page itself.
            course_html = self.session._get(url).text
            soup = BeautifulSoup(course_html, "lxml")
            data_args = soup.find(
                "div", {"class": "ud-component--course-taking--app"}).attrs["data-module-args"]
            data_json = json.loads(data_args)
            course_id = data_json.get("courseId", None)
            portal_name = self.extract_portal_name(url)
            course = self._extract_course_info_json(
                url, course_id, portal_name)
        if course:
            course.update({"portal_name": portal_name})
            return course
        if not course:
            print("Downloading course information, course id not found .. ")
            print(
                "It seems either you are not enrolled or you have to visit the course at least once while you are logged in.",
            )
            print("Trying to logout now...")
            self.session.terminate()
            print("Logged out successfully.")
            sys.exit(0)
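
Note on the lookup order in `_extract_course_info`: the repeated "fetch a listing, try to match, fall through" steps can also be written as a loop over sources. The sketch below is only an illustration under stated assumptions: `first_course` and the stand-in fetchers are hypothetical names, and matching on `published_title` is an assumption, since the body of `_extract_course` is not shown in this diff.

```python
from typing import Callable, Iterable, Optional


def first_course(sources: Iterable[Callable[[], list]],
                 published_title: str) -> Optional[dict]:
    """Try each source in order and return the first matching course.

    Later sources are not called once a match is found.
    """
    for fetch in sources:
        for course in fetch() or []:
            if course.get("published_title") == published_title:
                return course
    return None


# Stand-in fetchers (hypothetical); they record calls instead of
# hitting the real API.
calls = []


def my_courses():
    calls.append("my_courses")
    return [{"id": 1, "published_title": "python-basics"}]


def archived_courses():
    calls.append("archived_courses")
    return [{"id": 2, "published_title": "old-course"}]


found = first_course([my_courses, archived_courses], "python-basics")
print(found, calls)
```

Because the loop returns on the first hit, the archived listing is never requested when the course is already in the first listing, which matches the `if not course:` chain above.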


@@ -0,0 +1,38 @@
"""
This file was modified from udemy-dl
https://github.com/r0oth3x49/udemy-dl/
Copyright (c) 2018-2025 Nasir Khan (r0ot h3x49)
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the
Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software,
and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR
ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH
THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
"""
from Session import Session
import sys
class UdemyAuth(object):
    def __init__(self, username="", password="", cache_session=False):
        self.username = username
        self.password = password
        self._cache = cache_session
        self._session = Session()
    def authenticate(self, access_token, cookies):
        if access_token:
            self._session._set_auth_headers(
                access_token=access_token, cookies=cookies)
            self._session._session.cookies.update(
                {"access_token": access_token})
            return self._session, access_token
        else:
            raise RuntimeError("No access token is present")

File diff suppressed because it is too large


@@ -0,0 +1 @@
from UdemyDownloader import UdemyDownloader


@@ -0,0 +1,4 @@
from UdemyDownloader import UdemyDownloader
if __name__ == "__main__":
    UdemyDownloader()


@@ -0,0 +1,16 @@
HEADERS = {
    "Origin": "www.udemy.com",
    # "User-Agent":
    # "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:90.0) Gecko/20100101 Firefox/90.0",
    "Accept": "*/*",
    "Accept-Encoding": None,
}
LOGIN_URL = "https://www.udemy.com/join/login-popup/?ref=&display_type=popup&loc"
LOGOUT_URL = "https://www.udemy.com/user/logout"
COURSE_INFO_URL = "https://{portal_name}.udemy.com/api-2.0/courses/{course_id}/"
COURSE_URL = "https://{portal_name}.udemy.com/api-2.0/courses/{course_id}/cached-subscriber-curriculum-items?fields[asset]=results,title,external_url,time_estimation,download_urls,slide_urls,filename,asset_type,captions,media_license_token,course_is_drmed,media_sources,stream_urls,body&fields[chapter]=object_index,title,sort_order&fields[lecture]=id,title,object_index,asset,supplementary_assets,view_html&page_size=10000"
COURSE_SEARCH = "https://{portal_name}.udemy.com/api-2.0/users/me/subscribed-courses?fields[course]=id,url,title,published_title&page=1&page_size=500&search={course_name}"
SUBSCRIBED_COURSES = "https://{portal_name}.udemy.com/api-2.0/users/me/subscribed-courses/?ordering=-last_accessed&fields[course]=id,title,url&page=1&page_size=12"
MY_COURSES_URL = "https://{portal_name}.udemy.com/api-2.0/users/me/subscribed-courses?fields[course]=id,url,title,published_title&ordering=-last_accessed,-access_time&page=1&page_size=10000"
COLLECTION_URL = "https://{portal_name}.udemy.com/api-2.0/users/me/subscribed-courses-collections/?collection_has_courses=True&course_limit=20&fields[course]=last_accessed_time,title,published_title&fields[user_has_subscribed_courses_collection]=@all&page=1&page_size=1000"
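
Note on using these templates: they are plain `str.format` templates, so only the `{portal_name}`, `{course_id}` and `{course_name}` placeholders are substituted, and the square brackets in `fields[course]` pass through unchanged. A minimal sketch follows, assuming a portal name of `www` and a made-up course id; percent-encoding the search term with `urllib.parse.quote` is a suggestion here, not something this diff shows the downloader doing.

```python
from urllib.parse import quote

# Copied from the constants above (COURSE_SEARCH split across two literals).
COURSE_INFO_URL = "https://{portal_name}.udemy.com/api-2.0/courses/{course_id}/"
COURSE_SEARCH = (
    "https://{portal_name}.udemy.com/api-2.0/users/me/subscribed-courses"
    "?fields[course]=id,url,title,published_title&page=1&page_size=500"
    "&search={course_name}"
)

# "www" and 12345 are illustrative values only.
info_url = COURSE_INFO_URL.format(portal_name="www", course_id=12345)

# quote() percent-encodes characters such as spaces in the search term.
search_url = COURSE_SEARCH.format(portal_name="www",
                                  course_name=quote("python basics"))

print(info_url)
print(search_url)
```

Since `MY_COURSES_URL` already ends in a query string, appending `&is_archived=true` to the formatted URL, as `_archived_courses` does, yields a valid URL.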


@@ -4,6 +4,27 @@ http://download.macromedia.com/f4v/video_file_format_spec_v10_1.pdf
@author: Alastair McCormack
@license: MIT License
The MIT License (MIT)
Copyright (c) 2015 use-sparingly
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
"""
import bitstring


@@ -1,4 +1,17 @@
# This file is from https://github.com/r0oth3x49/udemy-dl/blob/master/udemy/sanitize.py
"""
Copyright (c) 2018-2025 Nasir Khan (r0ot h3x49)
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the
Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software,
and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR
ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH
THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
"""
from __future__ import unicode_literals

udemy_downloader/utils.py Normal file

@@ -0,0 +1,177 @@
import codecs
import base64
import re
import os
import glob
import subprocess
import sys
from mp4parse import F4VParser
from widevine_pssh_pb2 import WidevinePsshData
from sanitize import sanitize, slugify, SLUG_OK
def extract_kid(mp4_file):
    """
    Parameters
    ----------
    mp4_file : str
        MP4 file with a PSSH header
    Returns
    -------
    str or None
        Hex-encoded content id, or None if no Widevine PSSH box is found
    """
    boxes = F4VParser.parse(filename=mp4_file)
    for box in boxes:
        if box.header.box_type == 'moov':
            # Pick the PSSH box whose system id is Widevine's; skip this
            # box instead of raising StopIteration when there is none.
            pssh_box = next((x for x in box.pssh if x.system_id ==
                             "edef8ba979d64acea3c827dcd51d21ed"), None)
            if pssh_box is None:
                continue
            payload = codecs.decode(pssh_box.payload, "hex")
            pssh = WidevinePsshData()
            pssh.ParseFromString(payload)
            content_id = base64.b16encode(pssh.content_id)
            return content_id.decode("utf-8")
    # No moov box or Widevine PSSH header found
    return None
def _clean(text):
    # Replace characters that are not allowed in file names with "_"
    # and strip trailing dots.
    ok = re.compile(r'[^\\/:*?!"<>|]')
    text = "".join(x if ok.match(x) else "_" for x in text)
    text = re.sub(r"\.+$", "", text.strip())
    return text
def _sanitize(self, unsafetext):
    # Note: `self` is unused in this module-level function.
    text = _clean(sanitize(
        slugify(unsafetext, lower=False, spaces=True, ok=SLUG_OK + "().[]")))
    return text
def durationtoseconds(period):
    """
    @author Jayapraveen
    """
    # Duration format in PTxDxHxMxS
    if (period[:2] == "PT"):
        period = period[2:]
        day = int(period.split("D")[0] if 'D' in period else 0)
        hour = int(period.split("H")[0].split("D")[-1] if 'H' in period else 0)
        minute = int(
            period.split("M")[0].split("H")[-1] if 'M' in period else 0)
        second = period.split("S")[0].split("M")[-1] if 'S' in period else "0"
        print("Total time: " + str(day) + " days " + str(hour) + " hours " +
              str(minute) + " minutes and " + str(second) + " seconds")
        # Add the parts numerically so that fractional seconds such as
        # "5.05" keep their leading zeros and whole seconds such as "5"
        # are not counted twice.
        total_time = float((day * 24 * 60 * 60) + (hour * 60 * 60) +
                           (minute * 60) + float(second))
        return total_time
    else:
        print("Duration Format Error")
        return None
def cleanup(path):
    """
    @author Jayapraveen
    """
    leftover_files = glob.glob(path + '/*.mp4', recursive=True)
    for file_list in leftover_files:
        try:
            os.remove(file_list)
        except OSError:
            print(f"Error deleting file: {file_list}")
    os.removedirs(path)
def remove_files(files):
    for file in files:
        os.remove(file)
def merge(video_title, video_filepath, audio_filepath, output_path, use_h265, h265_crf, ffmpeg_preset, h265_encoder, ffmpeg_framerate):
    """
    @author Jayapraveen
    """
    if os.name == "nt":
        if use_h265:
            command = "ffmpeg -y -i \"{}\" -i \"{}\" -c:v {} -filter:v fps={} -crf {} -preset {} -c:a copy -fflags +bitexact -map_metadata -1 -metadata title=\"{}\" \"{}\"".format(
                video_filepath, audio_filepath, h265_encoder, ffmpeg_framerate, h265_crf, ffmpeg_preset, video_title, output_path)
        else:
            command = "ffmpeg -y -i \"{}\" -i \"{}\" -c:v {} -filter:v fps={} -preset {} -c:a copy -fflags +bitexact -map_metadata -1 -metadata title=\"{}\" \"{}\"".format(
                video_filepath, audio_filepath, h265_encoder, ffmpeg_framerate, ffmpeg_preset, video_title, output_path)
    else:
        # On non-Windows systems run ffmpeg at a lower priority with nice.
        if use_h265:
            command = "nice -n 7 ffmpeg -y -i \"{}\" -i \"{}\" -c:v {} -filter:v fps={} -crf {} -preset {} -c:a copy -fflags +bitexact -map_metadata -1 -metadata title=\"{}\" \"{}\"".format(
                video_filepath, audio_filepath, h265_encoder, ffmpeg_framerate, h265_crf, ffmpeg_preset, video_title, output_path)
        else:
            command = "nice -n 7 ffmpeg -y -i \"{}\" -i \"{}\" -c:v {} -filter:v fps={} -preset {} -c:a copy -fflags +bitexact -map_metadata -1 -metadata title=\"{}\" \"{}\"".format(
                video_filepath, audio_filepath, h265_encoder, ffmpeg_framerate, ffmpeg_preset, video_title, output_path)
    return os.system(command)
def decrypt(key, in_filepath, out_filepath):
    """
    @author Jayapraveen
    """
    # The strings use %-formatting, so no f-string prefix is needed.
    if (os.name == "nt"):
        ret_code = os.system("mp4decrypt --key 1:%s \"%s\" \"%s\"" %
                             (key, in_filepath, out_filepath))
    else:
        ret_code = os.system("nice -n 7 mp4decrypt --key 1:%s \"%s\" \"%s\"" %
                             (key, in_filepath, out_filepath))
    return ret_code
def check_for_aria():
    try:
        subprocess.Popen(["aria2c", "-v"],
                         stdout=subprocess.DEVNULL,
                         stderr=subprocess.DEVNULL).wait()
        return True
    except FileNotFoundError:
        return False
    except Exception as e:
        print(
            "> Unexpected exception while checking for Aria2c, please tell the program author about this! ",
            e)
        return True
def check_for_ffmpeg():
    try:
        subprocess.Popen(["ffmpeg"],
                         stderr=subprocess.DEVNULL,
                         stdout=subprocess.DEVNULL).wait()
        return True
    except FileNotFoundError:
        return False
    except Exception as e:
        print(
            "> Unexpected exception while checking for FFMPEG, please tell the program author about this! ",
            e)
        return True
def check_for_mp4decrypt():
    try:
        subprocess.Popen(["mp4decrypt"],
                         stderr=subprocess.DEVNULL,
                         stdout=subprocess.DEVNULL).wait()
        return True
    except FileNotFoundError:
        return False
    except Exception as e:
        print(
            "> Unexpected exception while checking for MP4Decrypt, please tell the program author about this! ",
            e)
        return True
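
Note on `merge` and `decrypt`: both build a shell command string and quote paths and titles by hand, which breaks if a title contains a double quote. One alternative, sketched below, is to build the ffmpeg arguments as a list so that no shell quoting is involved. `build_merge_args` and its default values are hypothetical and only mirror the options that `merge` passes; this is not the code in this commit.

```python
import shlex


def build_merge_args(video_filepath, audio_filepath, output_path, video_title,
                     encoder="libx265", framerate=30, crf=28,
                     preset="medium"):
    """Return the ffmpeg invocation as an argument list.

    Each value is its own list element, so spaces and quotes in paths
    or titles need no escaping. The defaults are illustrative only.
    """
    return [
        "ffmpeg", "-y",
        "-i", video_filepath,
        "-i", audio_filepath,
        "-c:v", encoder,
        "-filter:v", f"fps={framerate}",
        "-crf", str(crf),
        "-preset", preset,
        "-c:a", "copy",
        "-fflags", "+bitexact",
        "-map_metadata", "-1",
        "-metadata", f"title={video_title}",
        output_path,
    ]


args = build_merge_args("in video.mp4", "in audio.m4a", "out.mp4",
                        'Lecture "1"')
# shlex.join (Python 3.8+) is used here only to display the command.
print(shlex.join(args))
```

Passing `args` to `subprocess.run(args).returncode` would run the command without a shell and return the exit status, much as `os.system` does here; that call requires ffmpeg on the PATH and is left out of the sketch.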


@@ -0,0 +1 @@
__version__ = '1.2.3-develop'


@@ -8,7 +8,7 @@ def convert(directory, filename):
     index = 0
     vtt_filepath = os.path.join(directory, filename + ".vtt")
     srt_filepath = os.path.join(directory, filename + ".srt")
-    srt = open(srt_filepath, "w")
+    srt = open(srt_filepath, 'w')
     for caption in WebVTT().read(vtt_filepath):
         index += 1


@@ -1,32 +0,0 @@
import mp4parse
import codecs
import widevine_pssh_pb2
import base64
def extract_kid(mp4_file):
    """
    Parameters
    ----------
    mp4_file : str
        MP4 file with a PSSH header
    Returns
    -------
    String
    """
    boxes = mp4parse.F4VParser.parse(filename=mp4_file)
    for box in boxes:
        if box.header.box_type == 'moov':
            pssh_box = next(x for x in box.pssh if x.system_id == "edef8ba979d64acea3c827dcd51d21ed")
            hex = codecs.decode(pssh_box.payload, "hex")
            pssh = widevine_pssh_pb2.WidevinePsshData()
            pssh.ParseFromString(hex)
            content_id = base64.b16encode(pssh.content_id)
            return content_id.decode("utf-8")
    # No Moof or PSSH header found
    return None