Compare commits

6c3cfe734060f827bb39c5c55596f0f294b5ab36..3697648a15f46ca39a181f9e8ae7e7609d66b43d

No commits in common. "6c3cfe734060f827bb39c5c55596f0f294b5ab36" and "3697648a15f46ca39a181f9e8ae7e7609d66b43d" have entirely different histories.

21 changed files with 175 additions and 291 deletions

View File

@@ -280,7 +280,7 @@ While all the other dependencies are optional, `ffmpeg` and `ffprobe` are highly
 * [**ffmpeg** and **ffprobe**](https://www.ffmpeg.org) - Required for [merging separate video and audio files](#format-selection) as well as for various [post-processing](#post-processing-options) tasks. License [depends on the build](https://www.ffmpeg.org/legal.html)
-There are bugs in ffmpeg that cause various issues when used alongside yt-dlp. Since ffmpeg is such an important dependency, we provide [custom builds](https://github.com/yt-dlp/FFmpeg-Builds#ffmpeg-static-auto-builds) with patches for some of these issues at [yt-dlp/FFmpeg-Builds](https://github.com/yt-dlp/FFmpeg-Builds). See [the readme](https://github.com/yt-dlp/FFmpeg-Builds#patches-applied) for details on the specific issues solved by these builds
+There are bugs in ffmpeg that causes various issues when used alongside yt-dlp. Since ffmpeg is such an important dependency, we provide [custom builds](https://github.com/yt-dlp/FFmpeg-Builds#ffmpeg-static-auto-builds) with patches for some of these issues at [yt-dlp/FFmpeg-Builds](https://github.com/yt-dlp/FFmpeg-Builds). See [the readme](https://github.com/yt-dlp/FFmpeg-Builds#patches-applied) for details on the specific issues solved by these builds
 **Important**: What you need is ffmpeg *binary*, **NOT** [the python package of the same name](https://pypi.org/project/ffmpeg)
@@ -1307,8 +1307,7 @@ The available fields are:
 - `uploader_id` (string): Nickname or id of the video uploader
 - `uploader_url` (string): URL to the video uploader's profile
 - `license` (string): License name the video is licensed under
-- `creators` (list): The creators of the video
-- `creator` (string): The creators of the video; comma-separated
+- `creator` (string): The creator of the video
 - `timestamp` (numeric): UNIX timestamp of the moment the video became available
 - `upload_date` (string): Video upload date in UTC (YYYYMMDD)
 - `release_timestamp` (numeric): UNIX timestamp of the moment the video was released
@@ -1386,17 +1385,13 @@ Available for the media that is a track or a part of a music album:
 - `track` (string): Title of the track
 - `track_number` (numeric): Number of the track within an album or a disc
 - `track_id` (string): Id of the track
-- `artists` (list): Artist(s) of the track
-- `artist` (string): Artist(s) of the track; comma-separated
-- `genres` (list): Genre(s) of the track
-- `genre` (string): Genre(s) of the track; comma-separated
-- `composers` (list): Composer(s) of the piece
-- `composer` (string): Composer(s) of the piece; comma-separated
+- `artist` (string): Artist(s) of the track
+- `genre` (string): Genre(s) of the track
 - `album` (string): Title of the album the track belongs to
 - `album_type` (string): Type of the album
-- `album_artists` (list): All artists appeared on the album
-- `album_artist` (string): All artists appeared on the album; comma-separated
+- `album_artist` (string): List of all artists appeared on the album
 - `disc_number` (numeric): Number of the disc or other physical medium the track belongs to
+- `composer` (string): Name of the composer
 Available only when using `--download-sections` and for `chapter:` prefix when using `--split-chapters` for videos with internal chapters:
@@ -1773,11 +1768,10 @@ Metadata fields | From
 `description`, `synopsis` | `description`
 `purl`, `comment` | `webpage_url`
 `track` | `track_number`
-`artist` | `artist`, `artists`, `creator`, `creators`, `uploader` or `uploader_id`
-`composer` | `composer` or `composers`
-`genre` | `genre` or `genres`
+`artist` | `artist`, `creator`, `uploader` or `uploader_id`
+`genre` | `genre`
 `album` | `album`
-`album_artist` | `album_artist` or `album_artists`
+`album_artist` | `album_artist`
 `disc` | `disc_number`
 `show` | `series`
 `season_number` | `season_number`
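
For context on how this mapping is consumed, the following minimal sketch (not part of the diff; the URL is a placeholder and the option names are assumed from the yt-dlp Python API) downloads a track with tags embedded via the `FFmpegMetadata` post-processor and uses the artist fallback chain from the table above in the output name:

    # hedged sketch, not taken from this repository's documentation
    from yt_dlp import YoutubeDL

    opts = {
        # '%(artist,uploader)s' falls back to uploader when no artist is set,
        # mirroring the artist row of the metadata mapping above
        'outtmpl': '%(artist,uploader)s - %(track,title)s.%(ext)s',
        'postprocessors': [{'key': 'FFmpegMetadata'}],  # applies the field mapping
    }
    with YoutubeDL(opts) as ydl:
        ydl.download(['https://example.com/watch?v=placeholder'])  # placeholder URL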

View File

@@ -223,10 +223,6 @@ def sanitize_got_info_dict(got_dict):
 if test_info_dict.get('display_id') == test_info_dict.get('id'):
 test_info_dict.pop('display_id')
-# Remove deprecated fields
-for old in YoutubeDL._deprecated_multivalue_fields.keys():
-test_info_dict.pop(old, None)
 # release_year may be generated from release_date
 if try_call(lambda: test_info_dict['release_year'] == int(test_info_dict['release_date'][:4])):
 test_info_dict.pop('release_year')

View File

@@ -941,7 +941,7 @@ class TestYoutubeDL(unittest.TestCase):
 def get_videos(filter_=None):
 ydl = YDL({'match_filter': filter_, 'simulate': True})
 for v in videos:
-ydl.process_ie_result(v.copy(), download=True)
+ydl.process_ie_result(v, download=True)
 return [v['id'] for v in ydl.downloaded_info_dicts]
 res = get_videos()

View File

@@ -2340,58 +2340,6 @@ Line 1
 self.assertEqual(traverse_obj(mobj, lambda k, _: k in (0, 'group')), ['0123', '3'],
 msg='function on a `re.Match` should give group name as well')
-# Test xml.etree.ElementTree.Element as input obj
-etree = xml.etree.ElementTree.fromstring('''<?xml version="1.0"?>
-<data>
-<country name="Liechtenstein">
-<rank>1</rank>
-<year>2008</year>
-<gdppc>141100</gdppc>
-<neighbor name="Austria" direction="E"/>
-<neighbor name="Switzerland" direction="W"/>
-</country>
-<country name="Singapore">
-<rank>4</rank>
-<year>2011</year>
-<gdppc>59900</gdppc>
-<neighbor name="Malaysia" direction="N"/>
-</country>
-<country name="Panama">
-<rank>68</rank>
-<year>2011</year>
-<gdppc>13600</gdppc>
-<neighbor name="Costa Rica" direction="W"/>
-<neighbor name="Colombia" direction="E"/>
-</country>
-</data>''')
-self.assertEqual(traverse_obj(etree, ''), etree,
-msg='empty str key should return the element itself')
-self.assertEqual(traverse_obj(etree, 'country'), list(etree),
-msg='str key should lead all children with that tag name')
-self.assertEqual(traverse_obj(etree, ...), list(etree),
-msg='`...` as key should return all children')
-self.assertEqual(traverse_obj(etree, lambda _, x: x[0].text == '4'), [etree[1]],
-msg='function as key should get element as value')
-self.assertEqual(traverse_obj(etree, lambda i, _: i == 1), [etree[1]],
-msg='function as key should get index as key')
-self.assertEqual(traverse_obj(etree, 0), etree[0],
-msg='int key should return the nth child')
-self.assertEqual(traverse_obj(etree, './/neighbor/@name'),
-['Austria', 'Switzerland', 'Malaysia', 'Costa Rica', 'Colombia'],
-msg='`@<attribute>` at end of path should give that attribute')
-self.assertEqual(traverse_obj(etree, '//neighbor/@fail'), [None, None, None, None, None],
-msg='`@<nonexistant>` at end of path should give `None`')
-self.assertEqual(traverse_obj(etree, ('//neighbor/@', 2)), {'name': 'Malaysia', 'direction': 'N'},
-msg='`@` should give the full attribute dict')
-self.assertEqual(traverse_obj(etree, '//year/text()'), ['2008', '2011', '2011'],
-msg='`text()` at end of path should give the inner text')
-self.assertEqual(traverse_obj(etree, '//*[@direction]/@direction'), ['E', 'W', 'N', 'W', 'E'],
-msg='full python xpath features should be supported')
-self.assertEqual(traverse_obj(etree, (0, '@name')), 'Liechtenstein',
-msg='special transformations should act on current element')
-self.assertEqual(traverse_obj(etree, ('country', 0, ..., 'text()', {int_or_none})), [1, 2008, 141100],
-msg='special transformations should act on current element')
 def test_http_header_dict(self):
 headers = HTTPHeaderDict()
 headers['ytdl-test'] = b'0'
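
The deleted tests above covered xpath-style traversal of `xml.etree.ElementTree.Element` objects. For context, a minimal stdlib-only sketch (not part of the diff) of the kinds of lookups those tests exercised, written against plain ElementTree as the '+' side of this compare would require:

    # hedged sketch using only the standard library
    import xml.etree.ElementTree as ET

    etree = ET.fromstring(
        '<data><country name="Liechtenstein"><rank>1</rank>'
        '<neighbor name="Austria" direction="E"/></country></data>')

    assert [el.get('name') for el in etree.iterfind('.//neighbor')] == ['Austria']  # '@name' at the end of a path
    assert [el.text for el in etree.iterfind('.//rank')] == ['1']                   # 'text()' at the end of a path
    assert etree.find('country').attrib == {'name': 'Liechtenstein'}                # '@' gave the full attribute dict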

View File

@@ -581,13 +581,6 @@ class YoutubeDL:
 'http_headers', 'stretched_ratio', 'no_resume', 'has_drm', 'extra_param_to_segment_url', 'hls_aes', 'downloader_options',
 'page_url', 'app', 'play_path', 'tc_url', 'flash_version', 'rtmp_live', 'rtmp_conn', 'rtmp_protocol', 'rtmp_real_time'
 }
-_deprecated_multivalue_fields = {
-'album_artist': 'album_artists',
-'artist': 'artists',
-'composer': 'composers',
-'creator': 'creators',
-'genre': 'genres',
-}
 _format_selection_exts = {
 'audio': set(MEDIA_EXTENSIONS.common_audio),
 'video': set(MEDIA_EXTENSIONS.common_video + ('3gp', )),
@@ -2648,14 +2641,6 @@ class YoutubeDL:
 if final and info_dict.get('%s_number' % field) is not None and not info_dict.get(field):
 info_dict[field] = '%s %d' % (field.capitalize(), info_dict['%s_number' % field])
-for old_key, new_key in self._deprecated_multivalue_fields.items():
-if new_key in info_dict and old_key in info_dict:
-self.deprecation_warning(f'Do not return {old_key!r} when {new_key!r} is present')
-elif old_value := info_dict.get(old_key):
-info_dict[new_key] = old_value.split(', ')
-elif new_value := info_dict.get(new_key):
-info_dict[old_key] = ', '.join(v.replace(',', '\N{FULLWIDTH COMMA}') for v in new_value)
 def _raise_pending_errors(self, info):
 err = info.pop('__pending_error', None)
 if err:
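
For context, the removed block above kept the singular and plural metadata fields in sync. A simplified standalone sketch of that behaviour (not the YoutubeDL implementation itself; the deprecation-warning branch is omitted and only two field pairs are shown):

    # hedged sketch of the deleted back-compat conversion
    FIELD_PAIRS = {'artist': 'artists', 'genre': 'genres'}

    def sync_multivalue_fields(info_dict):
        for old_key, new_key in FIELD_PAIRS.items():
            old_value, new_value = info_dict.get(old_key), info_dict.get(new_key)
            if old_value and not new_value:
                info_dict[new_key] = old_value.split(', ')  # 'A, B' -> ['A', 'B']
            elif new_value and not old_value:
                # commas inside a single name are escaped with a fullwidth comma
                info_dict[old_key] = ', '.join(v.replace(',', '\N{FULLWIDTH COMMA}') for v in new_value)
        return info_dict

    assert sync_multivalue_fields({'artists': ['Foo', 'Bar']})['artist'] == 'Foo, Bar'
    assert sync_multivalue_fields({'genre': 'Rock, Pop'})['genres'] == ['Rock', 'Pop']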

View File

@@ -186,7 +186,7 @@ def _firefox_browser_dir():
 if sys.platform in ('cygwin', 'win32'):
 return os.path.expandvars(R'%APPDATA%\Mozilla\Firefox\Profiles')
 elif sys.platform == 'darwin':
-return os.path.expanduser('~/Library/Application Support/Firefox/Profiles')
+return os.path.expanduser('~/Library/Application Support/Firefox')
 return os.path.expanduser('~/.mozilla/firefox')

View File

@@ -2019,6 +2019,7 @@ from .tunein import (
 TuneInPodcastEpisodeIE,
 TuneInShortenerIE,
 )
+from .turbo import TurboIE
 from .tv2 import (
 TV2IE,
 TV2ArticleIE,
@@ -2222,7 +2223,6 @@ from .viki import (
 VikiIE,
 VikiChannelIE,
 )
-from .viously import ViouslyIE
 from .viqeo import ViqeoIE
 from .viu import (
 ViuIE,

View File

@@ -31,7 +31,6 @@ from ..utils import (
 unified_timestamp,
 url_or_none,
 urlhandle_detect_ext,
-variadic,
 )
@@ -50,7 +49,7 @@ class ArchiveOrgIE(InfoExtractor):
 'release_date': '19681210',
 'timestamp': 1268695290,
 'upload_date': '20100315',
-'creators': ['SRI International'],
+'creator': 'SRI International',
 'uploader': 'laura@archive.org',
 'thumbnail': r're:https://archive\.org/download/.*\.jpg',
 'display_id': 'XD300-23_68HighlightsAResearchCntAugHumanIntellect.cdr',
@@ -110,7 +109,7 @@ class ArchiveOrgIE(InfoExtractor):
 'title': 'Turning',
 'ext': 'flac',
 'track': 'Turning',
-'creators': ['Grateful Dead'],
+'creator': 'Grateful Dead',
 'display_id': 'gd1977-05-08d01t01.flac',
 'track_number': 1,
 'album': '1977-05-08 - Barton Hall - Cornell University',
@@ -130,7 +129,7 @@ class ArchiveOrgIE(InfoExtractor):
 'location': 'Barton Hall - Cornell University',
 'duration': 438.68,
 'track': 'Deal',
-'creators': ['Grateful Dead'],
+'creator': 'Grateful Dead',
 'album': '1977-05-08 - Barton Hall - Cornell University',
 'release_date': '19770508',
 'display_id': 'gd1977-05-08d01t07.flac',
@@ -168,7 +167,7 @@ class ArchiveOrgIE(InfoExtractor):
 'upload_date': '20160610',
 'description': 'md5:f70956a156645a658a0dc9513d9e78b7',
 'uploader': 'dimitrios@archive.org',
-'creators': ['British Broadcasting Corporation', 'Time-Life Films'],
+'creator': 'British Broadcasting Corporation, Time-Life Films',
 'timestamp': 1465594947,
 },
 'playlist': [
@@ -258,7 +257,8 @@ class ArchiveOrgIE(InfoExtractor):
 'title': m['title'],
 'description': clean_html(m.get('description')),
 'uploader': dict_get(m, ['uploader', 'adder']),
-'creators': traverse_obj(m, ('creator', {variadic}, {lambda x: x[0] and list(x)})),
+'creator': traverse_obj(m, (
+'creator', (({list}, {lambda x: join_nonempty(*x, delim=', ')}), {str})), get_all=False) or None,
 'license': m.get('licenseurl'),
 'release_date': unified_strdate(m.get('date')),
 'timestamp': unified_timestamp(dict_get(m, ['publicdate', 'addeddate'])),
@@ -273,7 +273,8 @@ class ArchiveOrgIE(InfoExtractor):
 'title': f.get('title') or f['name'],
 'display_id': f['name'],
 'description': clean_html(f.get('description')),
-'creators': traverse_obj(f, ('creator', {variadic}, {lambda x: x[0] and list(x)})),
+'creator': traverse_obj(f, (
+'creator', (({list}, {lambda x: join_nonempty(*x, delim=', ')}), {str})), get_all=False) or None,
 'duration': parse_duration(f.get('length')),
 'track_number': int_or_none(f.get('track')),
 'album': f.get('album'),
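
Both sides above have to cope with archive.org returning `creator` either as a single string or as a list. A small sketch of the helper the '-' side leans on (assumes yt_dlp is importable; the values come from the tests above):

    # hedged sketch: variadic() wraps a bare string but passes lists through
    from yt_dlp.utils import variadic

    assert variadic('SRI International') == ('SRI International',)
    assert variadic(['British Broadcasting Corporation', 'Time-Life Films']) == [
        'British Broadcasting Corporation', 'Time-Life Films']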

View File

@@ -4,7 +4,6 @@ from functools import partial
 from .common import InfoExtractor
 from ..utils import (
 OnDemandPagedList,
-bug_reports_message,
 determine_ext,
 int_or_none,
 join_nonempty,
@@ -234,7 +233,7 @@ class ARDBetaMediathekIE(InfoExtractor):
 (?:(?:beta|www)\.)?ardmediathek\.de/
 (?:[^/]+/)?
 (?:player|live|video)/
-(?:[^?#]+/)?
+(?:(?P<display_id>[^?#]+)/)?
 (?P<id>[a-zA-Z0-9]+)
 /?(?:[?#]|$)'''
 _GEO_COUNTRIES = ['DE']
@@ -243,8 +242,8 @@ class ARDBetaMediathekIE(InfoExtractor):
 'url': 'https://www.ardmediathek.de/video/filme-im-mdr/liebe-auf-vier-pfoten/mdr-fernsehen/Y3JpZDovL21kci5kZS9zZW5kdW5nLzI4MjA0MC80MjIwOTEtNDAyNTM0',
 'md5': 'b6e8ab03f2bcc6e1f9e6cef25fcc03c4',
 'info_dict': {
-'display_id': 'Y3JpZDovL21kci5kZS9zZW5kdW5nLzI4MjA0MC80MjIwOTEtNDAyNTM0',
-'id': '12939099',
+'display_id': 'filme-im-mdr/liebe-auf-vier-pfoten/mdr-fernsehen',
+'id': 'Y3JpZDovL21kci5kZS9zZW5kdW5nLzI4MjA0MC80MjIwOTEtNDAyNTM0',
 'title': 'Liebe auf vier Pfoten',
 'description': r're:^Claudia Schmitt, Anwältin in Salzburg',
 'duration': 5222,
@@ -256,7 +255,7 @@ class ARDBetaMediathekIE(InfoExtractor):
 'series': 'Filme im MDR',
 'age_limit': 0,
 'channel': 'MDR',
-'_old_archive_ids': ['ardbetamediathek Y3JpZDovL21kci5kZS9zZW5kdW5nLzI4MjA0MC80MjIwOTEtNDAyNTM0'],
+'_old_archive_ids': ['ardbetamediathek 12939099'],
 },
 }, {
 'url': 'https://www.ardmediathek.de/mdr/video/die-robuste-roswita/Y3JpZDovL21kci5kZS9iZWl0cmFnL2Ntcy84MWMxN2MzZC0wMjkxLTRmMzUtODk4ZS0wYzhlOWQxODE2NGI/',
@@ -277,37 +276,37 @@ class ARDBetaMediathekIE(InfoExtractor):
 'url': 'https://www.ardmediathek.de/video/tagesschau-oder-tagesschau-20-00-uhr/das-erste/Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhZ2Vzc2NoYXUvZmM4ZDUxMjgtOTE0ZC00Y2MzLTgzNzAtNDZkNGNiZWJkOTll',
 'md5': '1e73ded21cb79bac065117e80c81dc88',
 'info_dict': {
-'id': '10049223',
+'id': 'Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhZ2Vzc2NoYXUvZmM4ZDUxMjgtOTE0ZC00Y2MzLTgzNzAtNDZkNGNiZWJkOTll',
 'ext': 'mp4',
 'title': 'tagesschau, 20:00 Uhr',
 'timestamp': 1636398000,
 'description': 'md5:39578c7b96c9fe50afdf5674ad985e6b',
 'upload_date': '20211108',
-'display_id': 'Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhZ2Vzc2NoYXUvZmM4ZDUxMjgtOTE0ZC00Y2MzLTgzNzAtNDZkNGNiZWJkOTll',
+'display_id': 'tagesschau-oder-tagesschau-20-00-uhr/das-erste',
 'duration': 915,
 'episode': 'tagesschau, 20:00 Uhr',
 'series': 'tagesschau',
 'thumbnail': 'https://api.ardmediathek.de/image-service/images/urn:ard:image:fbb21142783b0a49?w=960&ch=ee69108ae344f678',
 'channel': 'ARD-Aktuell',
-'_old_archive_ids': ['ardbetamediathek Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhZ2Vzc2NoYXUvZmM4ZDUxMjgtOTE0ZC00Y2MzLTgzNzAtNDZkNGNiZWJkOTll'],
+'_old_archive_ids': ['ardbetamediathek 10049223'],
 },
 }, {
 'url': 'https://www.ardmediathek.de/video/7-tage/7-tage-unter-harten-jungs/hr-fernsehen/N2I2YmM5MzgtNWFlOS00ZGFlLTg2NzMtYzNjM2JlNjk4MDg3',
 'md5': 'c428b9effff18ff624d4f903bda26315',
 'info_dict': {
-'id': '94834686',
+'id': 'N2I2YmM5MzgtNWFlOS00ZGFlLTg2NzMtYzNjM2JlNjk4MDg3',
 'ext': 'mp4',
 'duration': 2700,
 'episode': '7 Tage ... unter harten Jungs',
 'description': 'md5:0f215470dcd2b02f59f4bd10c963f072',
 'upload_date': '20231005',
 'timestamp': 1696491171,
-'display_id': 'N2I2YmM5MzgtNWFlOS00ZGFlLTg2NzMtYzNjM2JlNjk4MDg3',
+'display_id': '7-tage/7-tage-unter-harten-jungs/hr-fernsehen',
 'series': '7 Tage ...',
 'channel': 'HR',
 'thumbnail': 'https://api.ardmediathek.de/image-service/images/urn:ard:image:f6e6d5ffac41925c?w=960&ch=fa32ba69bc87989a',
 'title': '7 Tage ... unter harten Jungs',
-'_old_archive_ids': ['ardbetamediathek N2I2YmM5MzgtNWFlOS00ZGFlLTg2NzMtYzNjM2JlNjk4MDg3'],
+'_old_archive_ids': ['ardbetamediathek 94834686'],
 },
 }, {
 'url': 'https://beta.ardmediathek.de/ard/video/Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhdG9ydC9mYmM4NGM1NC0xNzU4LTRmZGYtYWFhZS0wYzcyZTIxNGEyMDE',
@@ -358,25 +357,14 @@ class ARDBetaMediathekIE(InfoExtractor):
 }), get_all=False)
 def _real_extract(self, url):
-display_id = self._match_id(url)
+video_id, display_id = self._match_valid_url(url).group('id', 'display_id')
 page_data = self._download_json(
-f'https://api.ardmediathek.de/page-gateway/pages/ard/item/{display_id}', display_id, query={
+f'https://api.ardmediathek.de/page-gateway/pages/ard/item/{video_id}', video_id, query={
 'embedded': 'false',
 'mcV6': 'true',
 })
-# For user convenience we use the old contentId instead of the longer crid
-# Ref: https://github.com/yt-dlp/yt-dlp/issues/8731#issuecomment-1874398283
-old_id = traverse_obj(page_data, ('tracking', 'atiCustomVars', 'contentId', {int}))
-if old_id is not None:
-video_id = str(old_id)
-archive_ids = [make_archive_id(ARDBetaMediathekIE, display_id)]
-else:
-self.report_warning(f'Could not extract contentId{bug_reports_message()}')
-video_id = display_id
-archive_ids = None
 player_data = traverse_obj(
 page_data, ('widgets', lambda _, v: v['type'] in ('player_ondemand', 'player_live'), {dict}), get_all=False)
 is_live = player_data.get('type') == 'player_live'
@@ -431,6 +419,8 @@ class ARDBetaMediathekIE(InfoExtractor):
 })
 age_limit = traverse_obj(page_data, ('fskRating', {lambda x: remove_start(x, 'FSK')}, {int_or_none}))
+old_id = traverse_obj(page_data, ('tracking', 'atiCustomVars', 'contentId'))
 return {
 'id': video_id,
 'display_id': display_id,
@@ -448,7 +438,7 @@ class ARDBetaMediathekIE(InfoExtractor):
 'channel': 'clipSourceName',
 })),
 **self._extract_episode_info(page_data.get('title')),
-'_old_archive_ids': archive_ids,
+'_old_archive_ids': [make_archive_id(ARDBetaMediathekIE, old_id)],
 }
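
Both versions record an `_old_archive_ids` entry so previously downloaded videos are still recognised in the download archive. A tiny sketch of how those values are built (assumes yt_dlp is importable; the id is the one from the first test above):

    # hedged sketch of the archive-id format seen in the tests
    from yt_dlp.utils import make_archive_id
    from yt_dlp.extractor.ard import ARDBetaMediathekIE

    assert make_archive_id(ARDBetaMediathekIE, '12939099') == 'ardbetamediathek 12939099'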

View File

@@ -278,7 +278,7 @@ class InfoExtractor:
 description: Full video description.
 uploader: Full name of the video uploader.
 license: License name the video is licensed under.
-creators: List of creators of the video.
+creator: The creator of the video.
 timestamp: UNIX timestamp of the moment the video was uploaded
 upload_date: Video upload date in UTC (YYYYMMDD).
 If not explicitly set, calculated from timestamp
@@ -422,16 +422,16 @@ class InfoExtractor:
 track_number: Number of the track within an album or a disc, as an integer.
 track_id: Id of the track (useful in case of custom indexing, e.g. 6.iii),
 as a unicode string.
-artists: List of artists of the track.
-composers: List of composers of the piece.
-genres: List of genres of the track.
+artist: Artist(s) of the track.
+genre: Genre(s) of the track.
 album: Title of the album the track belongs to.
 album_type: Type of the album (e.g. "Demo", "Full-length", "Split", "Compilation", etc).
-album_artists: List of all artists appeared on the album.
-E.g. ["Ash Borer", "Fell Voices"] or ["Various Artists"].
-Useful for splits and compilations.
+album_artist: List of all artists appeared on the album (e.g.
+"Ash Borer / Fell Voices" or "Various Artists", useful for splits
+and compilations).
 disc_number: Number of the disc or other physical medium the track belongs to,
 as an integer.
+composer: Composer of the piece
 The following fields should only be set for clips that should be cut from the original video:
@@ -442,18 +442,6 @@ class InfoExtractor:
 rows: Number of rows in each storyboard fragment, as an integer
 columns: Number of columns in each storyboard fragment, as an integer
-The following fields are deprecated and should not be set by new code:
-composer: Use "composers" instead.
-Composer(s) of the piece, comma-separated.
-artist: Use "artists" instead.
-Artist(s) of the track, comma-separated.
-genre: Use "genres" instead.
-Genre(s) of the track, comma-separated.
-album_artist: Use "album_artists" instead.
-All artists appeared on the album, comma-separated.
-creator: Use "creators" instead.
-The creator of the video.
 Unless mentioned otherwise, the fields should be Unicode strings.
 Unless mentioned otherwise, None is equivalent to absence of information.
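
To make the documented shapes concrete, a short illustrative snippet (values are invented; the shapes follow the '+' side of the docstring, where these music fields are single strings rather than lists):

    # hedged example of an extractor result under the '+' documentation
    info_dict = {
        'id': 'example-id',
        'title': 'Example Track',
        'track': 'Example Track',
        'track_number': 1,
        'artist': 'Artist A, Artist B',      # one comma-separated string
        'album': 'Example Album',
        'album_artist': 'Various Artists',
        'composer': 'Composer Name',
        'genre': 'J-Pop',
    }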

View File

@@ -514,7 +514,7 @@ class CrunchyrollMusicIE(CrunchyrollBaseIE):
 'track': 'Egaono Hana',
 'artist': 'Goose house',
 'thumbnail': r're:(?i)^https://www.crunchyroll.com/imgsrv/.*\.jpeg?$',
-'genres': ['J-Pop'],
+'genre': ['J-Pop'],
 },
 'params': {'skip_download': 'm3u8'},
 }, {
@@ -527,7 +527,7 @@ class CrunchyrollMusicIE(CrunchyrollBaseIE):
 'track': 'Crossing Field',
 'artist': 'LiSA',
 'thumbnail': r're:(?i)^https://www.crunchyroll.com/imgsrv/.*\.jpeg?$',
-'genres': ['Anime'],
+'genre': ['Anime'],
 },
 'params': {'skip_download': 'm3u8'},
 }, {
@@ -541,7 +541,7 @@ class CrunchyrollMusicIE(CrunchyrollBaseIE):
 'artist': 'LiSA',
 'thumbnail': r're:(?i)^https://www.crunchyroll.com/imgsrv/.*\.jpeg?$',
 'description': 'md5:747444e7e6300907b7a43f0a0503072e',
-'genres': ['J-Pop'],
+'genre': ['J-Pop'],
 },
 'params': {'skip_download': 'm3u8'},
 }, {
@@ -594,7 +594,7 @@ class CrunchyrollMusicIE(CrunchyrollBaseIE):
 'width': ('width', {int_or_none}),
 'height': ('height', {int_or_none}),
 }),
-'genres': ('genres', ..., 'displayValue'),
+'genre': ('genres', ..., 'displayValue'),
 'age_limit': ('maturity_ratings', -1, {parse_age_limit}),
 }),
 }
@@ -611,7 +611,7 @@ class CrunchyrollArtistIE(CrunchyrollBaseIE):
 'info_dict': {
 'id': 'MA179CB50D',
 'title': 'LiSA',
-'genres': ['J-Pop', 'Anime', 'Rock'],
+'genre': ['J-Pop', 'Anime', 'Rock'],
 'description': 'md5:16d87de61a55c3f7d6c454b73285938e',
 },
 'playlist_mincount': 83,
@@ -645,6 +645,6 @@ class CrunchyrollArtistIE(CrunchyrollBaseIE):
 'width': ('width', {int_or_none}),
 'height': ('height', {int_or_none}),
 }),
-'genres': ('genres', ..., 'displayValue'),
+'genre': ('genres', ..., 'displayValue'),
 }),
 }

View File

@@ -9,7 +9,7 @@ class MonsterSirenHypergryphMusicIE(InfoExtractor):
 'info_dict': {
 'id': '514562',
 'ext': 'wav',
-'artists': ['塞壬唱片-MSR'],
+'artist': ['塞壬唱片-MSR'],
 'album': 'Flame Shadow',
 'title': 'Flame Shadow',
 }
@@ -27,6 +27,6 @@ class MonsterSirenHypergryphMusicIE(InfoExtractor):
 'url': traverse_obj(json_data, ('player', 'songDetail', 'sourceUrl')),
 'ext': 'wav',
 'vcodec': 'none',
-'artists': traverse_obj(json_data, ('player', 'songDetail', 'artists', ...)),
+'artist': traverse_obj(json_data, ('player', 'songDetail', 'artists')),
 'album': traverse_obj(json_data, ('musicPlay', 'albumDetail', 'name'))
 }

View File

@@ -17,11 +17,11 @@ class MusicdexBaseIE(InfoExtractor):
 'track_number': track_json.get('number'),
 'url': format_field(track_json, 'url', 'https://www.musicdex.org/%s'),
 'duration': track_json.get('duration'),
-'genres': [genre.get('name') for genre in track_json.get('genres') or []],
+'genre': [genre.get('name') for genre in track_json.get('genres') or []],
 'like_count': track_json.get('likes_count'),
 'view_count': track_json.get('plays'),
-'artists': [artist.get('name') for artist in track_json.get('artists') or []],
-'album_artists': [artist.get('name') for artist in album_json.get('artists') or []],
+'artist': [artist.get('name') for artist in track_json.get('artists') or []],
+'album_artist': [artist.get('name') for artist in album_json.get('artists') or []],
 'thumbnail': format_field(album_json, 'image', 'https://www.musicdex.org/%s'),
 'album': album_json.get('name'),
 'release_year': try_get(album_json, lambda x: date_from_str(unified_strdate(x['release_date'])).year),
@@ -43,11 +43,11 @@ class MusicdexSongIE(MusicdexBaseIE):
 'track': 'dual existence',
 'track_number': 1,
 'duration': 266000,
-'genres': ['Anime'],
+'genre': ['Anime'],
 'like_count': int,
 'view_count': int,
-'artists': ['fripSide'],
-'album_artists': ['fripSide'],
+'artist': ['fripSide'],
+'album_artist': ['fripSide'],
 'thumbnail': 'https://www.musicdex.org/storage/album/9iDIam1DHTVqUG4UclFIEq1WAFGXfPW4y0TtZa91.png',
 'album': 'To Aru Kagaku no Railgun T OP2 Single - dual existence',
 'release_year': 2020
@@ -69,9 +69,9 @@ class MusicdexAlbumIE(MusicdexBaseIE):
 'playlist_mincount': 28,
 'info_dict': {
 'id': '56',
-'genres': ['OST'],
+'genre': ['OST'],
 'view_count': int,
-'artists': ['TENMON & Eiichiro Yanagi / minori'],
+'artist': ['TENMON & Eiichiro Yanagi / minori'],
 'title': 'ef - a tale of memories Original Soundtrack 2 ~fortissimo~',
 'release_year': 2008,
 'thumbnail': 'https://www.musicdex.org/storage/album/2rSHkyYBYfB7sbvElpEyTMcUn6toY7AohOgJuDlE.jpg',
@@ -88,9 +88,9 @@ class MusicdexAlbumIE(MusicdexBaseIE):
 'id': id,
 'title': data_json.get('name'),
 'description': data_json.get('description'),
-'genres': [genre.get('name') for genre in data_json.get('genres') or []],
+'genre': [genre.get('name') for genre in data_json.get('genres') or []],
 'view_count': data_json.get('plays'),
-'artists': [artist.get('name') for artist in data_json.get('artists') or []],
+'artist': [artist.get('name') for artist in data_json.get('artists') or []],
 'thumbnail': format_field(data_json, 'image', 'https://www.musicdex.org/%s'),
 'release_year': try_get(data_json, lambda x: date_from_str(unified_strdate(x['release_date'])).year),
 'entries': entries,

View File

@@ -665,7 +665,7 @@ class NhkRadiruLiveIE(InfoExtractor):
 noa_info = self._download_json(
 f'https:{config.find(".//url_program_noa").text}'.format(area=data.find('areakey').text),
-station, note=f'Downloading {area} station metadata', fatal=False)
+station, note=f'Downloading {area} station metadata')
 present_info = traverse_obj(noa_info, ('nowonair_list', self._NOA_STATION_IDS.get(station), 'present'))
 return {

View File

@@ -21,7 +21,7 @@ class StagePlusVODConcertIE(InfoExtractor):
 'id': 'vod_concert_APNM8GRFDPHMASJKBSPJACG',
 'title': 'Yuja Wang plays Rachmaninoff\'s Piano Concerto No. 2 from Odeonsplatz',
 'description': 'md5:50f78ec180518c9bdb876bac550996fc',
-'artists': ['Yuja Wang', 'Lorenzo Viotti'],
+'artist': ['Yuja Wang', 'Lorenzo Viotti'],
 'upload_date': '20230331',
 'timestamp': 1680249600,
 'release_date': '20210709',
@@ -40,10 +40,10 @@ class StagePlusVODConcertIE(InfoExtractor):
 'release_timestamp': 1625788800,
 'duration': 2207,
 'chapters': 'count:5',
-'artists': ['Yuja Wang'],
-'composers': ['Sergei Rachmaninoff'],
+'artist': ['Yuja Wang'],
+'composer': ['Sergei Rachmaninoff'],
 'album': 'Yuja Wang plays Rachmaninoff\'s Piano Concerto No. 2 from Odeonsplatz',
-'album_artists': ['Yuja Wang', 'Lorenzo Viotti'],
+'album_artist': ['Yuja Wang', 'Lorenzo Viotti'],
 'track': 'Piano Concerto No. 2 in C Minor, Op. 18',
 'track_number': 1,
 'genre': 'Instrumental Concerto',
@@ -474,7 +474,7 @@ fragment BannerFields on Banner {
 metadata = traverse_obj(data, {
 'title': 'title',
 'description': ('shortDescription', {str}),
-'artists': ('artists', 'edges', ..., 'node', 'name'),
+'artist': ('artists', 'edges', ..., 'node', 'name'),
 'timestamp': ('archiveReleaseDate', {unified_timestamp}),
 'release_timestamp': ('productionDate', {unified_timestamp}),
 })
@@ -494,7 +494,7 @@ fragment BannerFields on Banner {
 'formats': formats,
 'subtitles': subtitles,
 'album': metadata.get('title'),
-'album_artists': metadata.get('artist'),
+'album_artist': metadata.get('artist'),
 'track_number': idx,
 **metadata,
 **traverse_obj(video, {
@@ -506,8 +506,8 @@ fragment BannerFields on Banner {
 'title': 'title',
 'start_time': ('mark', {float_or_none}),
 }),
-'artists': ('artists', 'edges', ..., 'node', 'name'),
-'composers': ('work', 'composers', ..., 'name'),
+'artist': ('artists', 'edges', ..., 'node', 'name'),
+'composer': ('work', 'composers', ..., 'name'),
 'genre': ('work', 'genre', 'title'),
 }),
 })

yt_dlp/extractor/turbo.py (new file, 64 lines)
View File

@@ -0,0 +0,64 @@
import re
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
ExtractorError,
int_or_none,
qualities,
xpath_text,
)
class TurboIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?turbo\.fr/videos-voiture/(?P<id>[0-9]+)-'
_API_URL = 'http://www.turbo.fr/api/tv/xml.php?player_generique=player_generique&id={0:}'
_TEST = {
'url': 'http://www.turbo.fr/videos-voiture/454443-turbo-du-07-09-2014-renault-twingo-3-bentley-continental-gt-speed-ces-guide-achat-dacia.html',
'md5': '33f4b91099b36b5d5a91f84b5bcba600',
'info_dict': {
'id': '454443',
'ext': 'mp4',
'duration': 3715,
'title': 'Turbo du 07/09/2014 : Renault Twingo 3, Bentley Continental GT Speed, CES, Guide Achat Dacia... ',
'description': 'Turbo du 07/09/2014 : Renault Twingo 3, Bentley Continental GT Speed, CES, Guide Achat Dacia...',
'thumbnail': r're:^https?://.*\.jpg$',
}
}
def _real_extract(self, url):
mobj = self._match_valid_url(url)
video_id = mobj.group('id')
webpage = self._download_webpage(url, video_id)
playlist = self._download_xml(self._API_URL.format(video_id), video_id)
item = playlist.find('./channel/item')
if item is None:
raise ExtractorError('Playlist item was not found', expected=True)
title = xpath_text(item, './title', 'title')
duration = int_or_none(xpath_text(item, './durate', 'duration'))
thumbnail = xpath_text(item, './visuel_clip', 'thumbnail')
description = self._html_search_meta('description', webpage)
formats = []
get_quality = qualities(['3g', 'sd', 'hq'])
for child in item:
m = re.search(r'url_video_(?P<quality>.+)', child.tag)
if m:
quality = compat_str(m.group('quality'))
formats.append({
'format_id': quality,
'url': child.text,
'quality': get_quality(quality),
})
return {
'id': video_id,
'title': title,
'duration': duration,
'thumbnail': thumbnail,
'description': description,
'formats': formats,
}

View File

@@ -8,6 +8,7 @@ from .common import InfoExtractor
 from ..compat import (
 compat_parse_qs,
 compat_str,
+compat_urllib_parse_urlencode,
 compat_urllib_parse_urlparse,
 )
 from ..utils import (
@@ -190,20 +191,6 @@ class TwitchBaseIE(InfoExtractor):
 'url': thumbnail,
 }] if thumbnail else None
-def _extract_twitch_m3u8_formats(self, video_id, token, signature):
-"""Subclasses must define _M3U8_PATH"""
-return self._extract_m3u8_formats(
-f'{self._USHER_BASE}/{self._M3U8_PATH}/{video_id}.m3u8', video_id, 'mp4', query={
-'allow_source': 'true',
-'allow_audio_only': 'true',
-'allow_spectre': 'true',
-'p': random.randint(1000000, 10000000),
-'player': 'twitchweb',
-'playlist_include_framerate': 'true',
-'sig': signature,
-'token': token,
-})
 class TwitchVodIE(TwitchBaseIE):
 IE_NAME = 'twitch:vod'
@@ -216,7 +203,6 @@ class TwitchVodIE(TwitchBaseIE):
 )
 (?P<id>\d+)
 '''
-_M3U8_PATH = 'vod'
 _TESTS = [{
 'url': 'http://www.twitch.tv/riotgames/v/6528877?t=5m10s',
@@ -546,8 +532,20 @@ class TwitchVodIE(TwitchBaseIE):
 info = self._extract_info_gql(video, vod_id)
 access_token = self._download_access_token(vod_id, 'video', 'id')
-formats = self._extract_twitch_m3u8_formats(
-vod_id, access_token['value'], access_token['signature'])
+formats = self._extract_m3u8_formats(
+'%s/vod/%s.m3u8?%s' % (
+self._USHER_BASE, vod_id,
+compat_urllib_parse_urlencode({
+'allow_source': 'true',
+'allow_audio_only': 'true',
+'allow_spectre': 'true',
+'player': 'twitchweb',
+'playlist_include_framerate': 'true',
+'nauth': access_token['value'],
+'nauthsig': access_token['signature'],
+})),
+vod_id, 'mp4', entry_protocol='m3u8_native')
 formats.extend(self._extract_storyboard(vod_id, video.get('storyboard'), info.get('duration')))
 self._prefer_source(formats)
@@ -926,7 +924,6 @@ class TwitchStreamIE(TwitchBaseIE):
 )
 (?P<id>[^/#?]+)
 '''
-_M3U8_PATH = 'api/channel/hls'
 _TESTS = [{
 'url': 'http://www.twitch.tv/shroomztv',
@@ -1029,10 +1026,23 @@ class TwitchStreamIE(TwitchBaseIE):
 access_token = self._download_access_token(
 channel_name, 'stream', 'channelName')
+token = access_token['value']
 stream_id = stream.get('id') or channel_name
-formats = self._extract_twitch_m3u8_formats(
-channel_name, access_token['value'], access_token['signature'])
+query = {
+'allow_source': 'true',
+'allow_audio_only': 'true',
+'allow_spectre': 'true',
+'p': random.randint(1000000, 10000000),
+'player': 'twitchweb',
+'playlist_include_framerate': 'true',
+'segment_preference': '4',
+'sig': access_token['signature'].encode('utf-8'),
+'token': token.encode('utf-8'),
+}
+formats = self._extract_m3u8_formats(
+'%s/api/channel/hls/%s.m3u8' % (self._USHER_BASE, channel_name),
+stream_id, 'mp4', query=query)
 self._prefer_source(formats)
 view_count = stream.get('viewers')

View File

@@ -1,60 +0,0 @@
import base64
import re
from .common import InfoExtractor
from ..utils import (
extract_attributes,
int_or_none,
parse_iso8601,
)
from ..utils.traversal import traverse_obj
class ViouslyIE(InfoExtractor):
_VALID_URL = False
_WEBPAGE_TESTS = [{
'url': 'http://www.turbo.fr/videos-voiture/454443-turbo-du-07-09-2014-renault-twingo-3-bentley-continental-gt-speed-ces-guide-achat-dacia.html',
'md5': '37a6c3381599381ff53a7e1e0575c0bc',
'info_dict': {
'id': 'F_xQzS2jwb3',
'ext': 'mp4',
'title': 'Turbo du 07/09/2014\xa0: Renault Twingo 3, Bentley Continental GT Speed, CES, Guide Achat Dacia...',
'description': 'Turbo du 07/09/2014\xa0: Renault Twingo 3, Bentley Continental GT Speed, CES, Guide Achat Dacia...',
'age_limit': 0,
'upload_date': '20230328',
'timestamp': 1680037507,
'duration': 3716,
'categories': ['motors'],
}
}]
def _extract_from_webpage(self, url, webpage):
viously_players = re.findall(r'<div[^>]*class="(?:[^"]*\s)?v(?:iou)?sly-player(?:\s[^"]*)?"[^>]*>', webpage)
if not viously_players:
return
def custom_decode(text):
STANDARD_ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/='
CUSTOM_ALPHABET = 'VIOUSLYABCDEFGHJKMNPQRTWXZviouslyabcdefghjkmnpqrtwxz9876543210+/='
data = base64.b64decode(text.translate(str.maketrans(CUSTOM_ALPHABET, STANDARD_ALPHABET)))
return data.decode('utf-8').strip('\x00')
for video_id in traverse_obj(viously_players, (..., {extract_attributes}, 'id')):
formats = self._extract_m3u8_formats(
f'https://www.viously.com/video/hls/{video_id}/index.m3u8', video_id, fatal=False)
if not formats:
continue
data = self._download_json(
f'https://www.viously.com/export/json/{video_id}', video_id,
transform_source=custom_decode, fatal=False)
yield {
'id': video_id,
'formats': formats,
**traverse_obj(data, ('video', {
'title': ('title', {str}),
'description': ('description', {str}),
'duration': ('duration', {int_or_none}),
'timestamp': ('iso_date', {parse_iso8601}),
'categories': ('category', 'name', {str}, {lambda x: [x] if x else None}),
})),
}
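
The deleted extractor above decodes its JSON payload with a base64 variant that uses a shuffled alphabet. A self-contained round-trip sketch of that scheme (the encode helper is hypothetical and exists here only to demonstrate the decode direction used by the extractor):

    # hedged sketch of the alphabet-swap decoding
    import base64

    STANDARD = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/='
    CUSTOM = 'VIOUSLYABCDEFGHJKMNPQRTWXZviouslyabcdefghjkmnpqrtwxz9876543210+/='

    def custom_decode(text):
        # translate the custom alphabet back to standard base64, then decode
        data = base64.b64decode(text.translate(str.maketrans(CUSTOM, STANDARD)))
        return data.decode('utf-8').strip('\x00')

    def custom_encode(text):  # hypothetical inverse, for the round-trip only
        return base64.b64encode(text.encode()).decode().translate(str.maketrans(STANDARD, CUSTOM))

    assert custom_decode(custom_encode('{"video": {"title": "demo"}}')) == '{"video": {"title": "demo"}}'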

View File

@@ -2068,8 +2068,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
 'title': 'Voyeur Girl',
 'description': 'md5:7ae382a65843d6df2685993e90a8628f',
 'upload_date': '20190312',
-'artists': ['Stephen'],
-'creators': ['Stephen'],
+'artist': 'Stephen',
 'track': 'Voyeur Girl',
 'album': 'it\'s too much love to know my dear',
 'release_date': '20190313',
@@ -2082,6 +2081,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
 'channel': 'Stephen', # TODO: should be "Stephen - Topic"
 'uploader': 'Stephen',
 'availability': 'public',
+'creator': 'Stephen',
 'duration': 169,
 'thumbnail': 'https://i.ytimg.com/vi_webp/MgNrAu2pzNs/maxresdefault.webp',
 'age_limit': 0,
@@ -4386,8 +4386,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
 release_year = release_date[:4]
 info.update({
 'album': mobj.group('album'.strip()),
-'artists': ([a] if (a := mobj.group('clean_artist'))
-else [a.strip() for a in mobj.group('artist').split('·')]),
+'artist': mobj.group('clean_artist') or ', '.join(a.strip() for a in mobj.group('artist').split('·')),
 'track': mobj.group('track').strip(),
 'release_date': release_date,
 'release_year': int_or_none(release_year),
@@ -4533,7 +4532,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
 if mrr_title == 'Album':
 info['album'] = mrr_contents_text
 elif mrr_title == 'Artist':
-info['artists'] = [mrr_contents_text]
+info['artist'] = mrr_contents_text
 elif mrr_title == 'Song':
 info['track'] = mrr_contents_text
 owner_badges = self._extract_badges(traverse_obj(vsir, ('owner', 'videoOwnerRenderer', 'badges')))
@@ -4567,7 +4566,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
 if fmt.get('protocol') == 'm3u8_native':
 fmt['__needs_testing'] = True
-for s_k, d_k in [('artists', 'creators'), ('track', 'alt_title')]:
+for s_k, d_k in [('artist', 'creator'), ('track', 'alt_title')]:
 v = info.get(s_k)
 if v:
 info[d_k] = v

View File

@@ -738,10 +738,9 @@ class FFmpegMetadataPP(FFmpegPostProcessor):
 def add(meta_list, info_list=None):
 value = next((
-info[key] for key in [f'{meta_prefix}_'] + list(variadic(info_list or meta_list))
+str(info[key]) for key in [f'{meta_prefix}_'] + list(variadic(info_list or meta_list))
 if info.get(key) is not None), None)
 if value not in ('', None):
-value = ', '.join(map(str, variadic(value)))
 value = value.replace('\0', '') # nul character cannot be passed in command line
 metadata['common'].update({meta_f: value for meta_f in variadic(meta_list)})
@@ -755,11 +754,10 @@ class FFmpegMetadataPP(FFmpegPostProcessor):
 add(('description', 'synopsis'), 'description')
 add(('purl', 'comment'), 'webpage_url')
 add('track', 'track_number')
-add('artist', ('artist', 'artists', 'creator', 'creators', 'uploader', 'uploader_id'))
-add('composer', ('composer', 'composers'))
-add('genre', ('genre', 'genres'))
+add('artist', ('artist', 'creator', 'uploader', 'uploader_id'))
+add('genre')
 add('album')
-add('album_artist', ('album_artist', 'album_artists'))
+add('album_artist')
 add('disc', 'disc_number')
 add('show', 'series')
 add('season_number')

View File

@@ -3,7 +3,6 @@ import contextlib
 import inspect
 import itertools
 import re
-import xml.etree.ElementTree
 from ._utils import (
 IDENTITY,
@@ -119,7 +118,7 @@ def traverse_obj(
 branching = True
 if isinstance(obj, collections.abc.Mapping):
 result = obj.values()
-elif is_iterable_like(obj) or isinstance(obj, xml.etree.ElementTree.Element):
+elif is_iterable_like(obj):
 result = obj
 elif isinstance(obj, re.Match):
 result = obj.groups()
@@ -133,7 +132,7 @@ def traverse_obj(
 branching = True
 if isinstance(obj, collections.abc.Mapping):
 iter_obj = obj.items()
-elif is_iterable_like(obj) or isinstance(obj, xml.etree.ElementTree.Element):
+elif is_iterable_like(obj):
 iter_obj = enumerate(obj)
 elif isinstance(obj, re.Match):
 iter_obj = itertools.chain(
@@ -169,7 +168,7 @@ def traverse_obj(
 result = next((v for k, v in obj.groupdict().items() if casefold(k) == key), None)
 elif isinstance(key, (int, slice)):
-if is_iterable_like(obj, (collections.abc.Sequence, xml.etree.ElementTree.Element)):
+if is_iterable_like(obj, collections.abc.Sequence):
 branching = isinstance(key, slice)
 with contextlib.suppress(IndexError):
 result = obj[key]
@@ -177,34 +176,6 @@ def traverse_obj(
 with contextlib.suppress(IndexError):
 result = str(obj)[key]
-elif isinstance(obj, xml.etree.ElementTree.Element) and isinstance(key, str):
-xpath, _, special = key.rpartition('/')
-if not special.startswith('@') and special != 'text()':
-xpath = key
-special = None
-# Allow abbreviations of relative paths, absolute paths error
-if xpath.startswith('/'):
-xpath = f'.{xpath}'
-elif xpath and not xpath.startswith('./'):
-xpath = f'./{xpath}'
-def apply_specials(element):
-if special is None:
-return element
-if special == '@':
-return element.attrib
-if special.startswith('@'):
-return try_call(element.attrib.get, args=(special[1:],))
-if special == 'text()':
-return element.text
-assert False, f'apply_specials is missing case for {special!r}'
-if xpath:
-result = list(map(apply_specials, obj.iterfind(xpath)))
-else:
-result = apply_specials(obj)
 return branching, result if branching else (result,)
 def lazy_last(iterable):
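
After this hunk, traverse_obj still branches over mappings, sequences and re.Match objects; only the ElementTree handling is gone. A short sketch of the behaviour that remains (assumes yt_dlp is importable):

    # hedged sketch of the retained traversal features
    from yt_dlp.utils import traverse_obj

    data = {'items': [{'name': 'a', 'n': 1}, {'name': 'b', 'n': 2}]}

    assert traverse_obj(data, ('items', 0, 'name')) == 'a'    # int key indexes a sequence
    assert traverse_obj(data, ('items', ..., 'n')) == [1, 2]  # `...` branches over all children
    assert traverse_obj(data, ('items', lambda _, v: v['n'] > 1, 'name')) == ['b']  # callable filter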