Compare commits

39 Commits

Author SHA1 Message Date
sepro 6c3cfe7340 Change more extractors to list form 2024-01-12 23:12:28 +01:00
sepro 2858814ab6 Use creators in web archive 2024-01-12 22:34:58 +01:00
sepro 050e6eb521 Revert adding composer to readme 2024-01-12 22:22:18 +01:00
    Was already added in other PR
sepro 585ea55ac0 Merge branch 'pr/8917' into cleanup 2024-01-12 21:51:45 +01:00
sepro 0b6914d871 Merge branch 'master' into cleanup 2024-01-12 21:49:30 +01:00
pukkandan b40e1e76bd oops 2024-01-12 23:28:25 +05:30
pukkandan 6aa45a9a69 More robust warning 2024-01-12 22:40:10 +05:30
pukkandan 5ced986ab7 Clean docs 2024-01-12 22:38:58 +05:30
pukkandan 7f3a69ae68 [ie/youtube] Migrate artist 2024-01-12 22:08:11 +05:30
pukkandan af8e0c8e8b Replace comma with unicode 2024-01-12 22:08:11 +05:30
pukkandan 75a6541ad2 [test] Test only new fields 2024-01-12 22:07:50 +05:30
pukkandan 694da355d2 Handle when both fields are returned 2024-01-12 21:18:27 +05:30
pukkandan 1531f4f69e Stricter Splitting 2024-01-12 20:09:40 +05:30
pukkandan 9e76a7ecbc typo 2024-01-12 06:44:28 +05:30
pukkandan 916acca08f Add creators 2024-01-12 06:32:53 +05:30
    Ref: https://github.com/yt-dlp/yt-dlp/pull/8906#r1440046129
pukkandan b81745716e Cleanup 2024-01-12 06:26:30 +05:30
pukkandan afccd2d730 We weren't able to deprecate 2024-01-12 06:07:39 +05:30
pukkandan 5bed30d642 Future-proof 2024-01-12 06:07:05 +05:30
Max 95e82347b3 [ie/Viously] Add extractor (#8927) 2024-01-09 04:11:52 +01:00
    Replaces Turbo extractor
    Authored by: nbr23, seproDev
    Co-authored-by: sepro <4618135+seproDev@users.noreply.github.com>
DmitryScaletta 5b8c69ae04 [ie/twitch] Fix m3u8 extraction (#8960) 2024-01-09 02:47:13 +00:00
    Closes #8958
    Authored by: DmitryScaletta
garret 5af1f19787 [ie/NhkRadiruLive] Make metadata extraction non-fatal (#8956) 2024-01-08 17:59:44 +00:00
    Authored by: garret1317
Lev Plyusnin 482a971bc2 Fix linting 2024-01-08 21:29:26 +07:00
Lev Plyusnin 4bfd8ed511 Update README to reflect changes in FFMpegMetadataPP 2024-01-08 21:16:28 +07:00
Lev Plyusnin 84c89c3c7b Better backward compatibility 2024-01-08 21:06:40 +07:00
Simon Sawicki b6951271ac [ie/ard:mediathek] Revert to using old id (#8916) 2024-01-05 21:34:38 +01:00
    Authored by: Grub4K
Simon Sawicki ffbd4f2a02 [utils] traverse_obj: Support xml.etree.ElementTree.Element (#8911) 2024-01-05 21:26:17 +01:00
    Authored by: Grub4K
mara004 292d60b1ed [cleanup] Fix typo in README.md (#8894) 2024-01-05 18:13:46 +01:00
    Authored by: antonkesy
Lev Plyusnin 2598790093 Revert MutagenMetadataPP 2024-01-03 15:16:10 +07:00
Lev Plyusnin dca6384283 Update README and fix IE documentation typo 2024-01-03 15:05:08 +07:00
Lev Plyusnin c6246594cf Update README 2024-01-03 14:55:47 +07:00
Lev Plyusnin c3fe956e87 Revert unrelated change 2024-01-03 14:52:19 +07:00
Lev Plyusnin 41c3dab547 Revert unrelated changes 2024-01-03 14:50:55 +07:00
Lev Plyusnin 265e0f7154 Rename new fields 2024-01-03 14:12:02 +07:00
    - Moved fix_deprecated_fields into _fill_common_fields
pukkandan ac52bf0952 Update yt_dlp/YoutubeDL.py 2024-01-03 09:16:21 +05:30
pukkandan d60ad19944 Update yt_dlp/extractor/common.py 2024-01-03 09:10:53 +05:30
pukkandan a691696290 Apply suggestions from code review 2024-01-03 09:10:07 +05:30
pukkandan 698199b0e8 Apply suggestions from code review 2024-01-03 09:07:28 +05:30
Lev Plyusnin 071326c0cc [ie] Add new fields with proper support for multiple values 2024-01-03 08:35:28 +07:00
Ralph Drake 85b33f5c16 [cookies] Fix --cookies-from-browser with macOS Firefox profiles (#8909) 2024-01-02 00:58:36 +00:00
    Ref: https://support.mozilla.org/en-US/kb/profile-manager-create-remove-switch-firefox-profiles#firefox:mac
    Closes #8898
    Authored by: RalphORama
21 changed files with 291 additions and 175 deletions

@ -280,7 +280,7 @@ While all the other dependencies are optional, `ffmpeg` and `ffprobe` are highly
* [**ffmpeg** and **ffprobe**](https://www.ffmpeg.org) - Required for [merging separate video and audio files](#format-selection) as well as for various [post-processing](#post-processing-options) tasks. License [depends on the build](https://www.ffmpeg.org/legal.html) * [**ffmpeg** and **ffprobe**](https://www.ffmpeg.org) - Required for [merging separate video and audio files](#format-selection) as well as for various [post-processing](#post-processing-options) tasks. License [depends on the build](https://www.ffmpeg.org/legal.html)
There are bugs in ffmpeg that causes various issues when used alongside yt-dlp. Since ffmpeg is such an important dependency, we provide [custom builds](https://github.com/yt-dlp/FFmpeg-Builds#ffmpeg-static-auto-builds) with patches for some of these issues at [yt-dlp/FFmpeg-Builds](https://github.com/yt-dlp/FFmpeg-Builds). See [the readme](https://github.com/yt-dlp/FFmpeg-Builds#patches-applied) for details on the specific issues solved by these builds There are bugs in ffmpeg that cause various issues when used alongside yt-dlp. Since ffmpeg is such an important dependency, we provide [custom builds](https://github.com/yt-dlp/FFmpeg-Builds#ffmpeg-static-auto-builds) with patches for some of these issues at [yt-dlp/FFmpeg-Builds](https://github.com/yt-dlp/FFmpeg-Builds). See [the readme](https://github.com/yt-dlp/FFmpeg-Builds#patches-applied) for details on the specific issues solved by these builds
**Important**: What you need is ffmpeg *binary*, **NOT** [the python package of the same name](https://pypi.org/project/ffmpeg) **Important**: What you need is ffmpeg *binary*, **NOT** [the python package of the same name](https://pypi.org/project/ffmpeg)
@ -1307,7 +1307,8 @@ The available fields are:
- `uploader_id` (string): Nickname or id of the video uploader - `uploader_id` (string): Nickname or id of the video uploader
- `uploader_url` (string): URL to the video uploader's profile - `uploader_url` (string): URL to the video uploader's profile
- `license` (string): License name the video is licensed under - `license` (string): License name the video is licensed under
- `creator` (string): The creator of the video - `creators` (list): The creators of the video
- `creator` (string): The creators of the video; comma-separated
- `timestamp` (numeric): UNIX timestamp of the moment the video became available - `timestamp` (numeric): UNIX timestamp of the moment the video became available
- `upload_date` (string): Video upload date in UTC (YYYYMMDD) - `upload_date` (string): Video upload date in UTC (YYYYMMDD)
- `release_timestamp` (numeric): UNIX timestamp of the moment the video was released - `release_timestamp` (numeric): UNIX timestamp of the moment the video was released
@ -1385,13 +1386,17 @@ Available for the media that is a track or a part of a music album:
- `track` (string): Title of the track - `track` (string): Title of the track
- `track_number` (numeric): Number of the track within an album or a disc - `track_number` (numeric): Number of the track within an album or a disc
- `track_id` (string): Id of the track - `track_id` (string): Id of the track
- `artist` (string): Artist(s) of the track - `artists` (list): Artist(s) of the track
- `genre` (string): Genre(s) of the track - `artist` (string): Artist(s) of the track; comma-separated
- `genres` (list): Genre(s) of the track
- `genre` (string): Genre(s) of the track; comma-separated
- `composers` (list): Composer(s) of the piece
- `composer` (string): Composer(s) of the piece; comma-separated
- `album` (string): Title of the album the track belongs to - `album` (string): Title of the album the track belongs to
- `album_type` (string): Type of the album - `album_type` (string): Type of the album
- `album_artist` (string): List of all artists appeared on the album - `album_artists` (list): All artists that appeared on the album
- `album_artist` (string): All artists that appeared on the album; comma-separated
- `disc_number` (numeric): Number of the disc or other physical medium the track belongs to - `disc_number` (numeric): Number of the disc or other physical medium the track belongs to
- `composer` (string): Name of the composer
Available only when using `--download-sections` and for `chapter:` prefix when using `--split-chapters` for videos with internal chapters: Available only when using `--download-sections` and for `chapter:` prefix when using `--split-chapters` for videos with internal chapters:
@ -1768,10 +1773,11 @@ Metadata fields | From
`description`, `synopsis` | `description` `description`, `synopsis` | `description`
`purl`, `comment` | `webpage_url` `purl`, `comment` | `webpage_url`
`track` | `track_number` `track` | `track_number`
`artist` | `artist`, `creator`, `uploader` or `uploader_id` `artist` | `artist`, `artists`, `creator`, `creators`, `uploader` or `uploader_id`
`genre` | `genre` `composer` | `composer` or `composers`
`genre` | `genre` or `genres`
`album` | `album` `album` | `album`
`album_artist` | `album_artist` `album_artist` | `album_artist` or `album_artists`
`disc` | `disc_number` `disc` | `disc_number`
`show` | `series` `show` | `series`
`season_number` | `season_number` `season_number` | `season_number`

@ -223,6 +223,10 @@ def sanitize_got_info_dict(got_dict):
if test_info_dict.get('display_id') == test_info_dict.get('id'): if test_info_dict.get('display_id') == test_info_dict.get('id'):
test_info_dict.pop('display_id') test_info_dict.pop('display_id')
# Remove deprecated fields
for old in YoutubeDL._deprecated_multivalue_fields.keys():
test_info_dict.pop(old, None)
# release_year may be generated from release_date # release_year may be generated from release_date
if try_call(lambda: test_info_dict['release_year'] == int(test_info_dict['release_date'][:4])): if try_call(lambda: test_info_dict['release_year'] == int(test_info_dict['release_date'][:4])):
test_info_dict.pop('release_year') test_info_dict.pop('release_year')

@ -941,7 +941,7 @@ class TestYoutubeDL(unittest.TestCase):
def get_videos(filter_=None): def get_videos(filter_=None):
ydl = YDL({'match_filter': filter_, 'simulate': True}) ydl = YDL({'match_filter': filter_, 'simulate': True})
for v in videos: for v in videos:
ydl.process_ie_result(v, download=True) ydl.process_ie_result(v.copy(), download=True)
return [v['id'] for v in ydl.downloaded_info_dicts] return [v['id'] for v in ydl.downloaded_info_dicts]
res = get_videos() res = get_videos()

@ -2340,6 +2340,58 @@ Line 1
self.assertEqual(traverse_obj(mobj, lambda k, _: k in (0, 'group')), ['0123', '3'], self.assertEqual(traverse_obj(mobj, lambda k, _: k in (0, 'group')), ['0123', '3'],
msg='function on a `re.Match` should give group name as well') msg='function on a `re.Match` should give group name as well')
# Test xml.etree.ElementTree.Element as input obj
etree = xml.etree.ElementTree.fromstring('''<?xml version="1.0"?>
<data>
<country name="Liechtenstein">
<rank>1</rank>
<year>2008</year>
<gdppc>141100</gdppc>
<neighbor name="Austria" direction="E"/>
<neighbor name="Switzerland" direction="W"/>
</country>
<country name="Singapore">
<rank>4</rank>
<year>2011</year>
<gdppc>59900</gdppc>
<neighbor name="Malaysia" direction="N"/>
</country>
<country name="Panama">
<rank>68</rank>
<year>2011</year>
<gdppc>13600</gdppc>
<neighbor name="Costa Rica" direction="W"/>
<neighbor name="Colombia" direction="E"/>
</country>
</data>''')
self.assertEqual(traverse_obj(etree, ''), etree,
msg='empty str key should return the element itself')
self.assertEqual(traverse_obj(etree, 'country'), list(etree),
msg='str key should lead all children with that tag name')
self.assertEqual(traverse_obj(etree, ...), list(etree),
msg='`...` as key should return all children')
self.assertEqual(traverse_obj(etree, lambda _, x: x[0].text == '4'), [etree[1]],
msg='function as key should get element as value')
self.assertEqual(traverse_obj(etree, lambda i, _: i == 1), [etree[1]],
msg='function as key should get index as key')
self.assertEqual(traverse_obj(etree, 0), etree[0],
msg='int key should return the nth child')
self.assertEqual(traverse_obj(etree, './/neighbor/@name'),
['Austria', 'Switzerland', 'Malaysia', 'Costa Rica', 'Colombia'],
msg='`@<attribute>` at end of path should give that attribute')
self.assertEqual(traverse_obj(etree, '//neighbor/@fail'), [None, None, None, None, None],
msg='`@<nonexistent>` at end of path should give `None`')
self.assertEqual(traverse_obj(etree, ('//neighbor/@', 2)), {'name': 'Malaysia', 'direction': 'N'},
msg='`@` should give the full attribute dict')
self.assertEqual(traverse_obj(etree, '//year/text()'), ['2008', '2011', '2011'],
msg='`text()` at end of path should give the inner text')
self.assertEqual(traverse_obj(etree, '//*[@direction]/@direction'), ['E', 'W', 'N', 'W', 'E'],
msg='full python xpath features should be supported')
self.assertEqual(traverse_obj(etree, (0, '@name')), 'Liechtenstein',
msg='special transformations should act on current element')
self.assertEqual(traverse_obj(etree, ('country', 0, ..., 'text()', {int_or_none})), [1, 2008, 141100],
msg='special transformations should act on current element')
def test_http_header_dict(self): def test_http_header_dict(self):
headers = HTTPHeaderDict() headers = HTTPHeaderDict()
headers['ytdl-test'] = b'0' headers['ytdl-test'] = b'0'
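
The XML assertions in the hunk above exercise `traverse_obj` path suffixes such as `@name` and `text()`. As a rough point of comparison, the sketch below gets similar values using only the standard library's `xml.etree.ElementTree`. It does not call `traverse_obj` at all, and the shortened XML document is an assumption made for brevity.

```python
import xml.etree.ElementTree as ET

# Shortened version of the test document (two countries instead of three).
root = ET.fromstring('''<data>
  <country name="Liechtenstein"><rank>1</rank><year>2008</year>
    <neighbor name="Austria" direction="E"/>
    <neighbor name="Switzerland" direction="W"/>
  </country>
  <country name="Singapore"><rank>4</rank><year>2011</year>
    <neighbor name="Malaysia" direction="N"/>
  </country>
</data>''')

# Roughly what './/neighbor/@name' asks for: one attribute per matched element.
neighbor_names = [el.get('name') for el in root.findall('.//neighbor')]

# Roughly what '//year/text()' asks for: the inner text of each matched element.
years = [el.text for el in root.findall('.//year')]

# Roughly what (0, '@name') asks for: an attribute of the first child.
first_country_name = root[0].get('name')

# A missing attribute yields None for every matched element.
missing = [el.get('fail') for el in root.findall('.//neighbor')]
```

The plain ElementTree version needs a separate comprehension for each attribute or text lookup, which is the boilerplate the new path suffixes appear intended to remove.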

@ -581,6 +581,13 @@ class YoutubeDL:
'http_headers', 'stretched_ratio', 'no_resume', 'has_drm', 'extra_param_to_segment_url', 'hls_aes', 'downloader_options', 'http_headers', 'stretched_ratio', 'no_resume', 'has_drm', 'extra_param_to_segment_url', 'hls_aes', 'downloader_options',
'page_url', 'app', 'play_path', 'tc_url', 'flash_version', 'rtmp_live', 'rtmp_conn', 'rtmp_protocol', 'rtmp_real_time' 'page_url', 'app', 'play_path', 'tc_url', 'flash_version', 'rtmp_live', 'rtmp_conn', 'rtmp_protocol', 'rtmp_real_time'
} }
_deprecated_multivalue_fields = {
'album_artist': 'album_artists',
'artist': 'artists',
'composer': 'composers',
'creator': 'creators',
'genre': 'genres',
}
_format_selection_exts = { _format_selection_exts = {
'audio': set(MEDIA_EXTENSIONS.common_audio), 'audio': set(MEDIA_EXTENSIONS.common_audio),
'video': set(MEDIA_EXTENSIONS.common_video + ('3gp', )), 'video': set(MEDIA_EXTENSIONS.common_video + ('3gp', )),
@ -2641,6 +2648,14 @@ class YoutubeDL:
if final and info_dict.get('%s_number' % field) is not None and not info_dict.get(field): if final and info_dict.get('%s_number' % field) is not None and not info_dict.get(field):
info_dict[field] = '%s %d' % (field.capitalize(), info_dict['%s_number' % field]) info_dict[field] = '%s %d' % (field.capitalize(), info_dict['%s_number' % field])
for old_key, new_key in self._deprecated_multivalue_fields.items():
if new_key in info_dict and old_key in info_dict:
self.deprecation_warning(f'Do not return {old_key!r} when {new_key!r} is present')
elif old_value := info_dict.get(old_key):
info_dict[new_key] = old_value.split(', ')
elif new_value := info_dict.get(new_key):
info_dict[old_key] = ', '.join(v.replace(',', '\N{FULLWIDTH COMMA}') for v in new_value)
def _raise_pending_errors(self, info): def _raise_pending_errors(self, info):
err = info.pop('__pending_error', None) err = info.pop('__pending_error', None)
if err: if err:
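
The reconciliation loop added to `YoutubeDL` in the hunk above can be tried in isolation. The following is a minimal standalone sketch, not the project's code: the function name `fill_multivalue_fields` is hypothetical, and `warnings.warn` stands in for `self.deprecation_warning`.

```python
import warnings

# Old (string) field name -> new (list) field name, copied from the hunk above.
DEPRECATED_MULTIVALUE_FIELDS = {
    'album_artist': 'album_artists',
    'artist': 'artists',
    'composer': 'composers',
    'creator': 'creators',
    'genre': 'genres',
}


def fill_multivalue_fields(info_dict):
    """Derive whichever of the old/new field is missing from the other (hypothetical helper)."""
    for old_key, new_key in DEPRECATED_MULTIVALUE_FIELDS.items():
        if new_key in info_dict and old_key in info_dict:
            # Stand-in for self.deprecation_warning(...) in the real class.
            warnings.warn(
                f'Do not return {old_key!r} when {new_key!r} is present',
                DeprecationWarning)
        elif old_value := info_dict.get(old_key):
            # Old string form: split on comma followed by a space.
            info_dict[new_key] = old_value.split(', ')
        elif new_value := info_dict.get(new_key):
            # New list form: join with ', ', swapping commas inside a value
            # for a fullwidth comma so the joined string stays unambiguous.
            info_dict[old_key] = ', '.join(
                v.replace(',', '\N{FULLWIDTH COMMA}') for v in new_value)
    return info_dict


from_old = fill_multivalue_fields({'artist': 'A, B'})
from_new = fill_multivalue_fields({'creators': ['X, Inc', 'Y']})
```

Here `from_old['artists']` is `['A', 'B']`, while `from_new['creator']` keeps `X, Inc` as a single entry by rewriting its comma to U+FF0C before joining.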

@ -186,7 +186,7 @@ def _firefox_browser_dir():
if sys.platform in ('cygwin', 'win32'): if sys.platform in ('cygwin', 'win32'):
return os.path.expandvars(R'%APPDATA%\Mozilla\Firefox\Profiles') return os.path.expandvars(R'%APPDATA%\Mozilla\Firefox\Profiles')
elif sys.platform == 'darwin': elif sys.platform == 'darwin':
return os.path.expanduser('~/Library/Application Support/Firefox') return os.path.expanduser('~/Library/Application Support/Firefox/Profiles')
return os.path.expanduser('~/.mozilla/firefox') return os.path.expanduser('~/.mozilla/firefox')
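
For reference, the post-change lookup can be written as a standalone function. This is a sketch only: it takes the platform as an argument (instead of reading `sys.platform`) so that each branch can be exercised on any machine, and the name `firefox_profiles_dir` is hypothetical.

```python
import os


def firefox_profiles_dir(platform):
    """Return the directory expected to contain Firefox profiles for `platform`."""
    if platform in ('cygwin', 'win32'):
        return os.path.expandvars(R'%APPDATA%\Mozilla\Firefox\Profiles')
    elif platform == 'darwin':
        # The fix in this hunk: macOS profiles live one level deeper, under "Profiles".
        return os.path.expanduser('~/Library/Application Support/Firefox/Profiles')
    return os.path.expanduser('~/.mozilla/firefox')


mac_dir = firefox_profiles_dir('darwin')
linux_dir = firefox_profiles_dir('linux')
```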

@ -2019,7 +2019,6 @@ from .tunein import (
TuneInPodcastEpisodeIE, TuneInPodcastEpisodeIE,
TuneInShortenerIE, TuneInShortenerIE,
) )
from .turbo import TurboIE
from .tv2 import ( from .tv2 import (
TV2IE, TV2IE,
TV2ArticleIE, TV2ArticleIE,
@ -2223,6 +2222,7 @@ from .viki import (
VikiIE, VikiIE,
VikiChannelIE, VikiChannelIE,
) )
from .viously import ViouslyIE
from .viqeo import ViqeoIE from .viqeo import ViqeoIE
from .viu import ( from .viu import (
ViuIE, ViuIE,

@ -31,6 +31,7 @@ from ..utils import (
unified_timestamp, unified_timestamp,
url_or_none, url_or_none,
urlhandle_detect_ext, urlhandle_detect_ext,
variadic,
) )
@ -49,7 +50,7 @@ class ArchiveOrgIE(InfoExtractor):
'release_date': '19681210', 'release_date': '19681210',
'timestamp': 1268695290, 'timestamp': 1268695290,
'upload_date': '20100315', 'upload_date': '20100315',
'creator': 'SRI International', 'creators': ['SRI International'],
'uploader': 'laura@archive.org', 'uploader': 'laura@archive.org',
'thumbnail': r're:https://archive\.org/download/.*\.jpg', 'thumbnail': r're:https://archive\.org/download/.*\.jpg',
'display_id': 'XD300-23_68HighlightsAResearchCntAugHumanIntellect.cdr', 'display_id': 'XD300-23_68HighlightsAResearchCntAugHumanIntellect.cdr',
@ -109,7 +110,7 @@ class ArchiveOrgIE(InfoExtractor):
'title': 'Turning', 'title': 'Turning',
'ext': 'flac', 'ext': 'flac',
'track': 'Turning', 'track': 'Turning',
'creator': 'Grateful Dead', 'creators': ['Grateful Dead'],
'display_id': 'gd1977-05-08d01t01.flac', 'display_id': 'gd1977-05-08d01t01.flac',
'track_number': 1, 'track_number': 1,
'album': '1977-05-08 - Barton Hall - Cornell University', 'album': '1977-05-08 - Barton Hall - Cornell University',
@ -129,7 +130,7 @@ class ArchiveOrgIE(InfoExtractor):
'location': 'Barton Hall - Cornell University', 'location': 'Barton Hall - Cornell University',
'duration': 438.68, 'duration': 438.68,
'track': 'Deal', 'track': 'Deal',
'creator': 'Grateful Dead', 'creators': ['Grateful Dead'],
'album': '1977-05-08 - Barton Hall - Cornell University', 'album': '1977-05-08 - Barton Hall - Cornell University',
'release_date': '19770508', 'release_date': '19770508',
'display_id': 'gd1977-05-08d01t07.flac', 'display_id': 'gd1977-05-08d01t07.flac',
@ -167,7 +168,7 @@ class ArchiveOrgIE(InfoExtractor):
'upload_date': '20160610', 'upload_date': '20160610',
'description': 'md5:f70956a156645a658a0dc9513d9e78b7', 'description': 'md5:f70956a156645a658a0dc9513d9e78b7',
'uploader': 'dimitrios@archive.org', 'uploader': 'dimitrios@archive.org',
'creator': 'British Broadcasting Corporation, Time-Life Films', 'creators': ['British Broadcasting Corporation', 'Time-Life Films'],
'timestamp': 1465594947, 'timestamp': 1465594947,
}, },
'playlist': [ 'playlist': [
@ -257,8 +258,7 @@ class ArchiveOrgIE(InfoExtractor):
'title': m['title'], 'title': m['title'],
'description': clean_html(m.get('description')), 'description': clean_html(m.get('description')),
'uploader': dict_get(m, ['uploader', 'adder']), 'uploader': dict_get(m, ['uploader', 'adder']),
'creator': traverse_obj(m, ( 'creators': traverse_obj(m, ('creator', {variadic}, {lambda x: x[0] and list(x)})),
'creator', (({list}, {lambda x: join_nonempty(*x, delim=', ')}), {str})), get_all=False) or None,
'license': m.get('licenseurl'), 'license': m.get('licenseurl'),
'release_date': unified_strdate(m.get('date')), 'release_date': unified_strdate(m.get('date')),
'timestamp': unified_timestamp(dict_get(m, ['publicdate', 'addeddate'])), 'timestamp': unified_timestamp(dict_get(m, ['publicdate', 'addeddate'])),
@ -273,8 +273,7 @@ class ArchiveOrgIE(InfoExtractor):
'title': f.get('title') or f['name'], 'title': f.get('title') or f['name'],
'display_id': f['name'], 'display_id': f['name'],
'description': clean_html(f.get('description')), 'description': clean_html(f.get('description')),
'creator': traverse_obj(f, ( 'creators': traverse_obj(f, ('creator', {variadic}, {lambda x: x[0] and list(x)})),
'creator', (({list}, {lambda x: join_nonempty(*x, delim=', ')}), {str})), get_all=False) or None,
'duration': parse_duration(f.get('length')), 'duration': parse_duration(f.get('length')),
'track_number': int_or_none(f.get('track')), 'track_number': int_or_none(f.get('track')),
'album': f.get('album'), 'album': f.get('album'),
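
The new `creators` expression wraps a single string in a sequence (via `variadic`) and then converts it to a list. A rough plain-Python approximation is sketched below. It assumes the metadata value is a string, a list of strings, or absent. The helper name `normalize_creators` is hypothetical, and how the real `traverse_obj` chain handles empty or unusual values may differ from this sketch.

```python
def normalize_creators(value):
    """Approximate the list-form `creators` extraction (hypothetical helper)."""
    if value is None:
        return None
    # A bare string becomes a one-element list; other sequences are copied to a list.
    items = [value] if isinstance(value, str) else list(value)
    # Treat an empty list or an empty first entry as "no information" (assumption).
    if not items or not items[0]:
        return None
    return items


single = normalize_creators('SRI International')
several = normalize_creators(['British Broadcasting Corporation', 'Time-Life Films'])
absent = normalize_creators(None)
```

The two populated results match the list-form values expected by the updated test cases above.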

@ -4,6 +4,7 @@ from functools import partial
from .common import InfoExtractor from .common import InfoExtractor
from ..utils import ( from ..utils import (
OnDemandPagedList, OnDemandPagedList,
bug_reports_message,
determine_ext, determine_ext,
int_or_none, int_or_none,
join_nonempty, join_nonempty,
@ -233,7 +234,7 @@ class ARDBetaMediathekIE(InfoExtractor):
(?:(?:beta|www)\.)?ardmediathek\.de/ (?:(?:beta|www)\.)?ardmediathek\.de/
(?:[^/]+/)? (?:[^/]+/)?
(?:player|live|video)/ (?:player|live|video)/
(?:(?P<display_id>[^?#]+)/)? (?:[^?#]+/)?
(?P<id>[a-zA-Z0-9]+) (?P<id>[a-zA-Z0-9]+)
/?(?:[?#]|$)''' /?(?:[?#]|$)'''
_GEO_COUNTRIES = ['DE'] _GEO_COUNTRIES = ['DE']
@ -242,8 +243,8 @@ class ARDBetaMediathekIE(InfoExtractor):
'url': 'https://www.ardmediathek.de/video/filme-im-mdr/liebe-auf-vier-pfoten/mdr-fernsehen/Y3JpZDovL21kci5kZS9zZW5kdW5nLzI4MjA0MC80MjIwOTEtNDAyNTM0', 'url': 'https://www.ardmediathek.de/video/filme-im-mdr/liebe-auf-vier-pfoten/mdr-fernsehen/Y3JpZDovL21kci5kZS9zZW5kdW5nLzI4MjA0MC80MjIwOTEtNDAyNTM0',
'md5': 'b6e8ab03f2bcc6e1f9e6cef25fcc03c4', 'md5': 'b6e8ab03f2bcc6e1f9e6cef25fcc03c4',
'info_dict': { 'info_dict': {
'display_id': 'filme-im-mdr/liebe-auf-vier-pfoten/mdr-fernsehen', 'display_id': 'Y3JpZDovL21kci5kZS9zZW5kdW5nLzI4MjA0MC80MjIwOTEtNDAyNTM0',
'id': 'Y3JpZDovL21kci5kZS9zZW5kdW5nLzI4MjA0MC80MjIwOTEtNDAyNTM0', 'id': '12939099',
'title': 'Liebe auf vier Pfoten', 'title': 'Liebe auf vier Pfoten',
'description': r're:^Claudia Schmitt, Anwältin in Salzburg', 'description': r're:^Claudia Schmitt, Anwältin in Salzburg',
'duration': 5222, 'duration': 5222,
@ -255,7 +256,7 @@ class ARDBetaMediathekIE(InfoExtractor):
'series': 'Filme im MDR', 'series': 'Filme im MDR',
'age_limit': 0, 'age_limit': 0,
'channel': 'MDR', 'channel': 'MDR',
'_old_archive_ids': ['ardbetamediathek 12939099'], '_old_archive_ids': ['ardbetamediathek Y3JpZDovL21kci5kZS9zZW5kdW5nLzI4MjA0MC80MjIwOTEtNDAyNTM0'],
}, },
}, { }, {
'url': 'https://www.ardmediathek.de/mdr/video/die-robuste-roswita/Y3JpZDovL21kci5kZS9iZWl0cmFnL2Ntcy84MWMxN2MzZC0wMjkxLTRmMzUtODk4ZS0wYzhlOWQxODE2NGI/', 'url': 'https://www.ardmediathek.de/mdr/video/die-robuste-roswita/Y3JpZDovL21kci5kZS9iZWl0cmFnL2Ntcy84MWMxN2MzZC0wMjkxLTRmMzUtODk4ZS0wYzhlOWQxODE2NGI/',
@ -276,37 +277,37 @@ class ARDBetaMediathekIE(InfoExtractor):
'url': 'https://www.ardmediathek.de/video/tagesschau-oder-tagesschau-20-00-uhr/das-erste/Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhZ2Vzc2NoYXUvZmM4ZDUxMjgtOTE0ZC00Y2MzLTgzNzAtNDZkNGNiZWJkOTll', 'url': 'https://www.ardmediathek.de/video/tagesschau-oder-tagesschau-20-00-uhr/das-erste/Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhZ2Vzc2NoYXUvZmM4ZDUxMjgtOTE0ZC00Y2MzLTgzNzAtNDZkNGNiZWJkOTll',
'md5': '1e73ded21cb79bac065117e80c81dc88', 'md5': '1e73ded21cb79bac065117e80c81dc88',
'info_dict': { 'info_dict': {
'id': 'Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhZ2Vzc2NoYXUvZmM4ZDUxMjgtOTE0ZC00Y2MzLTgzNzAtNDZkNGNiZWJkOTll', 'id': '10049223',
'ext': 'mp4', 'ext': 'mp4',
'title': 'tagesschau, 20:00 Uhr', 'title': 'tagesschau, 20:00 Uhr',
'timestamp': 1636398000, 'timestamp': 1636398000,
'description': 'md5:39578c7b96c9fe50afdf5674ad985e6b', 'description': 'md5:39578c7b96c9fe50afdf5674ad985e6b',
'upload_date': '20211108', 'upload_date': '20211108',
'display_id': 'tagesschau-oder-tagesschau-20-00-uhr/das-erste', 'display_id': 'Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhZ2Vzc2NoYXUvZmM4ZDUxMjgtOTE0ZC00Y2MzLTgzNzAtNDZkNGNiZWJkOTll',
'duration': 915, 'duration': 915,
'episode': 'tagesschau, 20:00 Uhr', 'episode': 'tagesschau, 20:00 Uhr',
'series': 'tagesschau', 'series': 'tagesschau',
'thumbnail': 'https://api.ardmediathek.de/image-service/images/urn:ard:image:fbb21142783b0a49?w=960&ch=ee69108ae344f678', 'thumbnail': 'https://api.ardmediathek.de/image-service/images/urn:ard:image:fbb21142783b0a49?w=960&ch=ee69108ae344f678',
'channel': 'ARD-Aktuell', 'channel': 'ARD-Aktuell',
'_old_archive_ids': ['ardbetamediathek 10049223'], '_old_archive_ids': ['ardbetamediathek Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhZ2Vzc2NoYXUvZmM4ZDUxMjgtOTE0ZC00Y2MzLTgzNzAtNDZkNGNiZWJkOTll'],
}, },
}, { }, {
'url': 'https://www.ardmediathek.de/video/7-tage/7-tage-unter-harten-jungs/hr-fernsehen/N2I2YmM5MzgtNWFlOS00ZGFlLTg2NzMtYzNjM2JlNjk4MDg3', 'url': 'https://www.ardmediathek.de/video/7-tage/7-tage-unter-harten-jungs/hr-fernsehen/N2I2YmM5MzgtNWFlOS00ZGFlLTg2NzMtYzNjM2JlNjk4MDg3',
'md5': 'c428b9effff18ff624d4f903bda26315', 'md5': 'c428b9effff18ff624d4f903bda26315',
'info_dict': { 'info_dict': {
'id': 'N2I2YmM5MzgtNWFlOS00ZGFlLTg2NzMtYzNjM2JlNjk4MDg3', 'id': '94834686',
'ext': 'mp4', 'ext': 'mp4',
'duration': 2700, 'duration': 2700,
'episode': '7 Tage ... unter harten Jungs', 'episode': '7 Tage ... unter harten Jungs',
'description': 'md5:0f215470dcd2b02f59f4bd10c963f072', 'description': 'md5:0f215470dcd2b02f59f4bd10c963f072',
'upload_date': '20231005', 'upload_date': '20231005',
'timestamp': 1696491171, 'timestamp': 1696491171,
'display_id': '7-tage/7-tage-unter-harten-jungs/hr-fernsehen', 'display_id': 'N2I2YmM5MzgtNWFlOS00ZGFlLTg2NzMtYzNjM2JlNjk4MDg3',
'series': '7 Tage ...', 'series': '7 Tage ...',
'channel': 'HR', 'channel': 'HR',
'thumbnail': 'https://api.ardmediathek.de/image-service/images/urn:ard:image:f6e6d5ffac41925c?w=960&ch=fa32ba69bc87989a', 'thumbnail': 'https://api.ardmediathek.de/image-service/images/urn:ard:image:f6e6d5ffac41925c?w=960&ch=fa32ba69bc87989a',
'title': '7 Tage ... unter harten Jungs', 'title': '7 Tage ... unter harten Jungs',
'_old_archive_ids': ['ardbetamediathek 94834686'], '_old_archive_ids': ['ardbetamediathek N2I2YmM5MzgtNWFlOS00ZGFlLTg2NzMtYzNjM2JlNjk4MDg3'],
}, },
}, { }, {
'url': 'https://beta.ardmediathek.de/ard/video/Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhdG9ydC9mYmM4NGM1NC0xNzU4LTRmZGYtYWFhZS0wYzcyZTIxNGEyMDE', 'url': 'https://beta.ardmediathek.de/ard/video/Y3JpZDovL2Rhc2Vyc3RlLmRlL3RhdG9ydC9mYmM4NGM1NC0xNzU4LTRmZGYtYWFhZS0wYzcyZTIxNGEyMDE',
@ -357,14 +358,25 @@ class ARDBetaMediathekIE(InfoExtractor):
}), get_all=False) }), get_all=False)
def _real_extract(self, url): def _real_extract(self, url):
video_id, display_id = self._match_valid_url(url).group('id', 'display_id') display_id = self._match_id(url)
page_data = self._download_json( page_data = self._download_json(
f'https://api.ardmediathek.de/page-gateway/pages/ard/item/{video_id}', video_id, query={ f'https://api.ardmediathek.de/page-gateway/pages/ard/item/{display_id}', display_id, query={
'embedded': 'false', 'embedded': 'false',
'mcV6': 'true', 'mcV6': 'true',
}) })
# For user convenience we use the old contentId instead of the longer crid
# Ref: https://github.com/yt-dlp/yt-dlp/issues/8731#issuecomment-1874398283
old_id = traverse_obj(page_data, ('tracking', 'atiCustomVars', 'contentId', {int}))
if old_id is not None:
video_id = str(old_id)
archive_ids = [make_archive_id(ARDBetaMediathekIE, display_id)]
else:
self.report_warning(f'Could not extract contentId{bug_reports_message()}')
video_id = display_id
archive_ids = None
player_data = traverse_obj( player_data = traverse_obj(
page_data, ('widgets', lambda _, v: v['type'] in ('player_ondemand', 'player_live'), {dict}), get_all=False) page_data, ('widgets', lambda _, v: v['type'] in ('player_ondemand', 'player_live'), {dict}), get_all=False)
is_live = player_data.get('type') == 'player_live' is_live = player_data.get('type') == 'player_live'
@ -419,8 +431,6 @@ class ARDBetaMediathekIE(InfoExtractor):
}) })
age_limit = traverse_obj(page_data, ('fskRating', {lambda x: remove_start(x, 'FSK')}, {int_or_none})) age_limit = traverse_obj(page_data, ('fskRating', {lambda x: remove_start(x, 'FSK')}, {int_or_none}))
old_id = traverse_obj(page_data, ('tracking', 'atiCustomVars', 'contentId'))
return { return {
'id': video_id, 'id': video_id,
'display_id': display_id, 'display_id': display_id,
@ -438,7 +448,7 @@ class ARDBetaMediathekIE(InfoExtractor):
'channel': 'clipSourceName', 'channel': 'clipSourceName',
})), })),
**self._extract_episode_info(page_data.get('title')), **self._extract_episode_info(page_data.get('title')),
'_old_archive_ids': [make_archive_id(ARDBetaMediathekIE, old_id)], '_old_archive_ids': archive_ids,
} }
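
The id selection introduced in `_real_extract` can be summarised as: prefer the numeric `contentId`, and otherwise fall back to the crid taken from the URL. The sketch below isolates that decision. `pick_ids` is a hypothetical name, and the local `make_archive_id` is a stand-in whose output format (lowercased extractor key, a space, then the id) is inferred from the `_old_archive_ids` test values above, not taken from the library.

```python
def make_archive_id(ie_key, video_id):
    # Stand-in for yt-dlp's helper; format inferred from the test data (assumption).
    return f'{ie_key.lower()} {video_id}'


def pick_ids(page_data, display_id):
    """Return (video_id, old_archive_ids) following the logic in the hunk above."""
    content_id = (page_data.get('tracking') or {}).get('atiCustomVars', {}).get('contentId')
    if isinstance(content_id, int):
        # Short numeric id for the user; keep the crid as an old archive id
        # so previously recorded downloads are still recognised.
        return str(content_id), [make_archive_id('ARDBetaMediathek', display_id)]
    # No usable contentId: the real code also emits a warning at this point.
    return display_id, None


with_id = pick_ids({'tracking': {'atiCustomVars': {'contentId': 12939099}}}, 'Y3JpZA')
without_id = pick_ids({}, 'Y3JpZA')
```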

@ -278,7 +278,7 @@ class InfoExtractor:
description: Full video description. description: Full video description.
uploader: Full name of the video uploader. uploader: Full name of the video uploader.
license: License name the video is licensed under. license: License name the video is licensed under.
creator: The creator of the video. creators: List of creators of the video.
timestamp: UNIX timestamp of the moment the video was uploaded timestamp: UNIX timestamp of the moment the video was uploaded
upload_date: Video upload date in UTC (YYYYMMDD). upload_date: Video upload date in UTC (YYYYMMDD).
If not explicitly set, calculated from timestamp If not explicitly set, calculated from timestamp
@ -422,16 +422,16 @@ class InfoExtractor:
track_number: Number of the track within an album or a disc, as an integer. track_number: Number of the track within an album or a disc, as an integer.
track_id: Id of the track (useful in case of custom indexing, e.g. 6.iii), track_id: Id of the track (useful in case of custom indexing, e.g. 6.iii),
as a unicode string. as a unicode string.
artist: Artist(s) of the track. artists: List of artists of the track.
genre: Genre(s) of the track. composers: List of composers of the piece.
genres: List of genres of the track.
album: Title of the album the track belongs to. album: Title of the album the track belongs to.
album_type: Type of the album (e.g. "Demo", "Full-length", "Split", "Compilation", etc). album_type: Type of the album (e.g. "Demo", "Full-length", "Split", "Compilation", etc).
album_artist: List of all artists appeared on the album (e.g. album_artists: List of all artists appeared on the album.
"Ash Borer / Fell Voices" or "Various Artists", useful for splits E.g. ["Ash Borer", "Fell Voices"] or ["Various Artists"].
and compilations). Useful for splits and compilations.
disc_number: Number of the disc or other physical medium the track belongs to, disc_number: Number of the disc or other physical medium the track belongs to,
as an integer. as an integer.
composer: Composer of the piece
The following fields should only be set for clips that should be cut from the original video: The following fields should only be set for clips that should be cut from the original video:
@ -442,6 +442,18 @@ class InfoExtractor:
rows: Number of rows in each storyboard fragment, as an integer rows: Number of rows in each storyboard fragment, as an integer
columns: Number of columns in each storyboard fragment, as an integer columns: Number of columns in each storyboard fragment, as an integer
The following fields are deprecated and should not be set by new code:
composer: Use "composers" instead.
Composer(s) of the piece, comma-separated.
artist: Use "artists" instead.
Artist(s) of the track, comma-separated.
genre: Use "genres" instead.
Genre(s) of the track, comma-separated.
album_artist: Use "album_artists" instead.
All artists that appeared on the album, comma-separated.
creator: Use "creators" instead.
The creator of the video.
Unless mentioned otherwise, the fields should be Unicode strings. Unless mentioned otherwise, the fields should be Unicode strings.
Unless mentioned otherwise, None is equivalent to absence of information. Unless mentioned otherwise, None is equivalent to absence of information.

@ -514,7 +514,7 @@ class CrunchyrollMusicIE(CrunchyrollBaseIE):
'track': 'Egaono Hana', 'track': 'Egaono Hana',
'artist': 'Goose house', 'artist': 'Goose house',
'thumbnail': r're:(?i)^https://www.crunchyroll.com/imgsrv/.*\.jpeg?$', 'thumbnail': r're:(?i)^https://www.crunchyroll.com/imgsrv/.*\.jpeg?$',
'genre': ['J-Pop'], 'genres': ['J-Pop'],
}, },
'params': {'skip_download': 'm3u8'}, 'params': {'skip_download': 'm3u8'},
}, { }, {
@ -527,7 +527,7 @@ class CrunchyrollMusicIE(CrunchyrollBaseIE):
'track': 'Crossing Field', 'track': 'Crossing Field',
'artist': 'LiSA', 'artist': 'LiSA',
'thumbnail': r're:(?i)^https://www.crunchyroll.com/imgsrv/.*\.jpeg?$', 'thumbnail': r're:(?i)^https://www.crunchyroll.com/imgsrv/.*\.jpeg?$',
'genre': ['Anime'], 'genres': ['Anime'],
}, },
'params': {'skip_download': 'm3u8'}, 'params': {'skip_download': 'm3u8'},
}, { }, {
@ -541,7 +541,7 @@ class CrunchyrollMusicIE(CrunchyrollBaseIE):
'artist': 'LiSA', 'artist': 'LiSA',
'thumbnail': r're:(?i)^https://www.crunchyroll.com/imgsrv/.*\.jpeg?$', 'thumbnail': r're:(?i)^https://www.crunchyroll.com/imgsrv/.*\.jpeg?$',
'description': 'md5:747444e7e6300907b7a43f0a0503072e', 'description': 'md5:747444e7e6300907b7a43f0a0503072e',
'genre': ['J-Pop'], 'genres': ['J-Pop'],
}, },
'params': {'skip_download': 'm3u8'}, 'params': {'skip_download': 'm3u8'},
}, { }, {
@ -594,7 +594,7 @@ class CrunchyrollMusicIE(CrunchyrollBaseIE):
'width': ('width', {int_or_none}), 'width': ('width', {int_or_none}),
'height': ('height', {int_or_none}), 'height': ('height', {int_or_none}),
}), }),
'genre': ('genres', ..., 'displayValue'), 'genres': ('genres', ..., 'displayValue'),
'age_limit': ('maturity_ratings', -1, {parse_age_limit}), 'age_limit': ('maturity_ratings', -1, {parse_age_limit}),
}), }),
} }
@ -611,7 +611,7 @@ class CrunchyrollArtistIE(CrunchyrollBaseIE):
'info_dict': { 'info_dict': {
'id': 'MA179CB50D', 'id': 'MA179CB50D',
'title': 'LiSA', 'title': 'LiSA',
'genre': ['J-Pop', 'Anime', 'Rock'], 'genres': ['J-Pop', 'Anime', 'Rock'],
'description': 'md5:16d87de61a55c3f7d6c454b73285938e', 'description': 'md5:16d87de61a55c3f7d6c454b73285938e',
}, },
'playlist_mincount': 83, 'playlist_mincount': 83,
@ -645,6 +645,6 @@ class CrunchyrollArtistIE(CrunchyrollBaseIE):
'width': ('width', {int_or_none}), 'width': ('width', {int_or_none}),
'height': ('height', {int_or_none}), 'height': ('height', {int_or_none}),
}), }),
'genre': ('genres', ..., 'displayValue'), 'genres': ('genres', ..., 'displayValue'),
}), }),
} }

@ -9,7 +9,7 @@ class MonsterSirenHypergryphMusicIE(InfoExtractor):
'info_dict': { 'info_dict': {
'id': '514562', 'id': '514562',
'ext': 'wav', 'ext': 'wav',
'artist': ['塞壬唱片-MSR'], 'artists': ['塞壬唱片-MSR'],
'album': 'Flame Shadow', 'album': 'Flame Shadow',
'title': 'Flame Shadow', 'title': 'Flame Shadow',
} }
@ -27,6 +27,6 @@ class MonsterSirenHypergryphMusicIE(InfoExtractor):
'url': traverse_obj(json_data, ('player', 'songDetail', 'sourceUrl')), 'url': traverse_obj(json_data, ('player', 'songDetail', 'sourceUrl')),
'ext': 'wav', 'ext': 'wav',
'vcodec': 'none', 'vcodec': 'none',
'artist': traverse_obj(json_data, ('player', 'songDetail', 'artists')), 'artists': traverse_obj(json_data, ('player', 'songDetail', 'artists', ...)),
'album': traverse_obj(json_data, ('musicPlay', 'albumDetail', 'name')) 'album': traverse_obj(json_data, ('musicPlay', 'albumDetail', 'name'))
} }

@ -17,11 +17,11 @@ class MusicdexBaseIE(InfoExtractor):
'track_number': track_json.get('number'), 'track_number': track_json.get('number'),
'url': format_field(track_json, 'url', 'https://www.musicdex.org/%s'), 'url': format_field(track_json, 'url', 'https://www.musicdex.org/%s'),
'duration': track_json.get('duration'), 'duration': track_json.get('duration'),
'genre': [genre.get('name') for genre in track_json.get('genres') or []], 'genres': [genre.get('name') for genre in track_json.get('genres') or []],
'like_count': track_json.get('likes_count'), 'like_count': track_json.get('likes_count'),
'view_count': track_json.get('plays'), 'view_count': track_json.get('plays'),
'artist': [artist.get('name') for artist in track_json.get('artists') or []], 'artists': [artist.get('name') for artist in track_json.get('artists') or []],
'album_artist': [artist.get('name') for artist in album_json.get('artists') or []], 'album_artists': [artist.get('name') for artist in album_json.get('artists') or []],
'thumbnail': format_field(album_json, 'image', 'https://www.musicdex.org/%s'), 'thumbnail': format_field(album_json, 'image', 'https://www.musicdex.org/%s'),
'album': album_json.get('name'), 'album': album_json.get('name'),
'release_year': try_get(album_json, lambda x: date_from_str(unified_strdate(x['release_date'])).year), 'release_year': try_get(album_json, lambda x: date_from_str(unified_strdate(x['release_date'])).year),
@ -43,11 +43,11 @@ class MusicdexSongIE(MusicdexBaseIE):
'track': 'dual existence', 'track': 'dual existence',
'track_number': 1, 'track_number': 1,
'duration': 266000, 'duration': 266000,
'genre': ['Anime'], 'genres': ['Anime'],
'like_count': int, 'like_count': int,
'view_count': int, 'view_count': int,
'artist': ['fripSide'], 'artists': ['fripSide'],
'album_artist': ['fripSide'], 'album_artists': ['fripSide'],
'thumbnail': 'https://www.musicdex.org/storage/album/9iDIam1DHTVqUG4UclFIEq1WAFGXfPW4y0TtZa91.png', 'thumbnail': 'https://www.musicdex.org/storage/album/9iDIam1DHTVqUG4UclFIEq1WAFGXfPW4y0TtZa91.png',
'album': 'To Aru Kagaku no Railgun T OP2 Single - dual existence', 'album': 'To Aru Kagaku no Railgun T OP2 Single - dual existence',
'release_year': 2020 'release_year': 2020
@ -69,9 +69,9 @@ class MusicdexAlbumIE(MusicdexBaseIE):
'playlist_mincount': 28, 'playlist_mincount': 28,
'info_dict': { 'info_dict': {
'id': '56', 'id': '56',
'genre': ['OST'], 'genres': ['OST'],
'view_count': int, 'view_count': int,
'artist': ['TENMON & Eiichiro Yanagi / minori'], 'artists': ['TENMON & Eiichiro Yanagi / minori'],
'title': 'ef - a tale of memories Original Soundtrack 2 ~fortissimo~', 'title': 'ef - a tale of memories Original Soundtrack 2 ~fortissimo~',
'release_year': 2008, 'release_year': 2008,
'thumbnail': 'https://www.musicdex.org/storage/album/2rSHkyYBYfB7sbvElpEyTMcUn6toY7AohOgJuDlE.jpg', 'thumbnail': 'https://www.musicdex.org/storage/album/2rSHkyYBYfB7sbvElpEyTMcUn6toY7AohOgJuDlE.jpg',
@ -88,9 +88,9 @@ class MusicdexAlbumIE(MusicdexBaseIE):
'id': id, 'id': id,
'title': data_json.get('name'), 'title': data_json.get('name'),
'description': data_json.get('description'), 'description': data_json.get('description'),
'genre': [genre.get('name') for genre in data_json.get('genres') or []], 'genres': [genre.get('name') for genre in data_json.get('genres') or []],
'view_count': data_json.get('plays'), 'view_count': data_json.get('plays'),
'artist': [artist.get('name') for artist in data_json.get('artists') or []], 'artists': [artist.get('name') for artist in data_json.get('artists') or []],
'thumbnail': format_field(data_json, 'image', 'https://www.musicdex.org/%s'), 'thumbnail': format_field(data_json, 'image', 'https://www.musicdex.org/%s'),
'release_year': try_get(data_json, lambda x: date_from_str(unified_strdate(x['release_date'])).year), 'release_year': try_get(data_json, lambda x: date_from_str(unified_strdate(x['release_date'])).year),
'entries': entries, 'entries': entries,

@ -665,7 +665,7 @@ class NhkRadiruLiveIE(InfoExtractor):
noa_info = self._download_json( noa_info = self._download_json(
f'https:{config.find(".//url_program_noa").text}'.format(area=data.find('areakey').text), f'https:{config.find(".//url_program_noa").text}'.format(area=data.find('areakey').text),
station, note=f'Downloading {area} station metadata') station, note=f'Downloading {area} station metadata', fatal=False)
present_info = traverse_obj(noa_info, ('nowonair_list', self._NOA_STATION_IDS.get(station), 'present')) present_info = traverse_obj(noa_info, ('nowonair_list', self._NOA_STATION_IDS.get(station), 'present'))
return { return {

@ -21,7 +21,7 @@ class StagePlusVODConcertIE(InfoExtractor):
'id': 'vod_concert_APNM8GRFDPHMASJKBSPJACG', 'id': 'vod_concert_APNM8GRFDPHMASJKBSPJACG',
'title': 'Yuja Wang plays Rachmaninoff\'s Piano Concerto No. 2 from Odeonsplatz', 'title': 'Yuja Wang plays Rachmaninoff\'s Piano Concerto No. 2 from Odeonsplatz',
'description': 'md5:50f78ec180518c9bdb876bac550996fc', 'description': 'md5:50f78ec180518c9bdb876bac550996fc',
'artist': ['Yuja Wang', 'Lorenzo Viotti'], 'artists': ['Yuja Wang', 'Lorenzo Viotti'],
'upload_date': '20230331', 'upload_date': '20230331',
'timestamp': 1680249600, 'timestamp': 1680249600,
'release_date': '20210709', 'release_date': '20210709',
@ -40,10 +40,10 @@ class StagePlusVODConcertIE(InfoExtractor):
'release_timestamp': 1625788800, 'release_timestamp': 1625788800,
'duration': 2207, 'duration': 2207,
'chapters': 'count:5', 'chapters': 'count:5',
'artist': ['Yuja Wang'], 'artists': ['Yuja Wang'],
'composer': ['Sergei Rachmaninoff'], 'composers': ['Sergei Rachmaninoff'],
'album': 'Yuja Wang plays Rachmaninoff\'s Piano Concerto No. 2 from Odeonsplatz', 'album': 'Yuja Wang plays Rachmaninoff\'s Piano Concerto No. 2 from Odeonsplatz',
'album_artist': ['Yuja Wang', 'Lorenzo Viotti'], 'album_artists': ['Yuja Wang', 'Lorenzo Viotti'],
'track': 'Piano Concerto No. 2 in C Minor, Op. 18', 'track': 'Piano Concerto No. 2 in C Minor, Op. 18',
'track_number': 1, 'track_number': 1,
'genre': 'Instrumental Concerto', 'genre': 'Instrumental Concerto',
@ -474,7 +474,7 @@ fragment BannerFields on Banner {
metadata = traverse_obj(data, { metadata = traverse_obj(data, {
'title': 'title', 'title': 'title',
'description': ('shortDescription', {str}), 'description': ('shortDescription', {str}),
'artist': ('artists', 'edges', ..., 'node', 'name'), 'artists': ('artists', 'edges', ..., 'node', 'name'),
'timestamp': ('archiveReleaseDate', {unified_timestamp}), 'timestamp': ('archiveReleaseDate', {unified_timestamp}),
'release_timestamp': ('productionDate', {unified_timestamp}), 'release_timestamp': ('productionDate', {unified_timestamp}),
}) })
@ -494,7 +494,7 @@ fragment BannerFields on Banner {
'formats': formats, 'formats': formats,
'subtitles': subtitles, 'subtitles': subtitles,
'album': metadata.get('title'), 'album': metadata.get('title'),
'album_artist': metadata.get('artist'), 'album_artists': metadata.get('artists'),
'track_number': idx, 'track_number': idx,
**metadata, **metadata,
**traverse_obj(video, { **traverse_obj(video, {
@ -506,8 +506,8 @@ fragment BannerFields on Banner {
'title': 'title', 'title': 'title',
'start_time': ('mark', {float_or_none}), 'start_time': ('mark', {float_or_none}),
}), }),
'artist': ('artists', 'edges', ..., 'node', 'name'), 'artists': ('artists', 'edges', ..., 'node', 'name'),
'composer': ('work', 'composers', ..., 'name'), 'composers': ('work', 'composers', ..., 'name'),
'genre': ('work', 'genre', 'title'), 'genre': ('work', 'genre', 'title'),
}), }),
}) })

@ -1,64 +0,0 @@
import re
from .common import InfoExtractor
from ..compat import compat_str
from ..utils import (
ExtractorError,
int_or_none,
qualities,
xpath_text,
)
class TurboIE(InfoExtractor):
_VALID_URL = r'https?://(?:www\.)?turbo\.fr/videos-voiture/(?P<id>[0-9]+)-'
_API_URL = 'http://www.turbo.fr/api/tv/xml.php?player_generique=player_generique&id={0:}'
_TEST = {
'url': 'http://www.turbo.fr/videos-voiture/454443-turbo-du-07-09-2014-renault-twingo-3-bentley-continental-gt-speed-ces-guide-achat-dacia.html',
'md5': '33f4b91099b36b5d5a91f84b5bcba600',
'info_dict': {
'id': '454443',
'ext': 'mp4',
'duration': 3715,
'title': 'Turbo du 07/09/2014 : Renault Twingo 3, Bentley Continental GT Speed, CES, Guide Achat Dacia... ',
'description': 'Turbo du 07/09/2014 : Renault Twingo 3, Bentley Continental GT Speed, CES, Guide Achat Dacia...',
'thumbnail': r're:^https?://.*\.jpg$',
}
}
def _real_extract(self, url):
mobj = self._match_valid_url(url)
video_id = mobj.group('id')
webpage = self._download_webpage(url, video_id)
playlist = self._download_xml(self._API_URL.format(video_id), video_id)
item = playlist.find('./channel/item')
if item is None:
raise ExtractorError('Playlist item was not found', expected=True)
title = xpath_text(item, './title', 'title')
duration = int_or_none(xpath_text(item, './durate', 'duration'))
thumbnail = xpath_text(item, './visuel_clip', 'thumbnail')
description = self._html_search_meta('description', webpage)
formats = []
get_quality = qualities(['3g', 'sd', 'hq'])
for child in item:
m = re.search(r'url_video_(?P<quality>.+)', child.tag)
if m:
quality = compat_str(m.group('quality'))
formats.append({
'format_id': quality,
'url': child.text,
'quality': get_quality(quality),
})
return {
'id': video_id,
'title': title,
'duration': duration,
'thumbnail': thumbnail,
'description': description,
'formats': formats,
}

@ -8,7 +8,6 @@ from .common import InfoExtractor
from ..compat import ( from ..compat import (
compat_parse_qs, compat_parse_qs,
compat_str, compat_str,
compat_urllib_parse_urlencode,
compat_urllib_parse_urlparse, compat_urllib_parse_urlparse,
) )
from ..utils import ( from ..utils import (
@ -191,6 +190,20 @@ class TwitchBaseIE(InfoExtractor):
'url': thumbnail, 'url': thumbnail,
}] if thumbnail else None }] if thumbnail else None
def _extract_twitch_m3u8_formats(self, video_id, token, signature):
"""Subclasses must define _M3U8_PATH"""
return self._extract_m3u8_formats(
f'{self._USHER_BASE}/{self._M3U8_PATH}/{video_id}.m3u8', video_id, 'mp4', query={
'allow_source': 'true',
'allow_audio_only': 'true',
'allow_spectre': 'true',
'p': random.randint(1000000, 10000000),
'player': 'twitchweb',
'playlist_include_framerate': 'true',
'sig': signature,
'token': token,
})
class TwitchVodIE(TwitchBaseIE): class TwitchVodIE(TwitchBaseIE):
IE_NAME = 'twitch:vod' IE_NAME = 'twitch:vod'
@ -203,6 +216,7 @@ class TwitchVodIE(TwitchBaseIE):
) )
(?P<id>\d+) (?P<id>\d+)
''' '''
_M3U8_PATH = 'vod'
_TESTS = [{ _TESTS = [{
'url': 'http://www.twitch.tv/riotgames/v/6528877?t=5m10s', 'url': 'http://www.twitch.tv/riotgames/v/6528877?t=5m10s',
@ -532,20 +546,8 @@ class TwitchVodIE(TwitchBaseIE):
info = self._extract_info_gql(video, vod_id) info = self._extract_info_gql(video, vod_id)
access_token = self._download_access_token(vod_id, 'video', 'id') access_token = self._download_access_token(vod_id, 'video', 'id')
formats = self._extract_m3u8_formats( formats = self._extract_twitch_m3u8_formats(
'%s/vod/%s.m3u8?%s' % ( vod_id, access_token['value'], access_token['signature'])
self._USHER_BASE, vod_id,
compat_urllib_parse_urlencode({
'allow_source': 'true',
'allow_audio_only': 'true',
'allow_spectre': 'true',
'player': 'twitchweb',
'playlist_include_framerate': 'true',
'nauth': access_token['value'],
'nauthsig': access_token['signature'],
})),
vod_id, 'mp4', entry_protocol='m3u8_native')
formats.extend(self._extract_storyboard(vod_id, video.get('storyboard'), info.get('duration'))) formats.extend(self._extract_storyboard(vod_id, video.get('storyboard'), info.get('duration')))
self._prefer_source(formats) self._prefer_source(formats)
@ -924,6 +926,7 @@ class TwitchStreamIE(TwitchBaseIE):
) )
(?P<id>[^/#?]+) (?P<id>[^/#?]+)
''' '''
_M3U8_PATH = 'api/channel/hls'
_TESTS = [{ _TESTS = [{
'url': 'http://www.twitch.tv/shroomztv', 'url': 'http://www.twitch.tv/shroomztv',
@ -1026,23 +1029,10 @@ class TwitchStreamIE(TwitchBaseIE):
access_token = self._download_access_token( access_token = self._download_access_token(
channel_name, 'stream', 'channelName') channel_name, 'stream', 'channelName')
token = access_token['value']
stream_id = stream.get('id') or channel_name stream_id = stream.get('id') or channel_name
query = { formats = self._extract_twitch_m3u8_formats(
'allow_source': 'true', channel_name, access_token['value'], access_token['signature'])
'allow_audio_only': 'true',
'allow_spectre': 'true',
'p': random.randint(1000000, 10000000),
'player': 'twitchweb',
'playlist_include_framerate': 'true',
'segment_preference': '4',
'sig': access_token['signature'].encode('utf-8'),
'token': token.encode('utf-8'),
}
formats = self._extract_m3u8_formats(
'%s/api/channel/hls/%s.m3u8' % (self._USHER_BASE, channel_name),
stream_id, 'mp4', query=query)
self._prefer_source(formats) self._prefer_source(formats)
view_count = stream.get('viewers') view_count = stream.get('viewers')
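
The shared helper above builds one playlist URL for both VODs and live streams, with only the path segment differing per subclass. A standalone sketch of that URL construction, assuming a placeholder base URL and a hypothetical function name `build_usher_url` (the real code delegates to `_extract_m3u8_formats` and its `query` argument instead):

```python
import random
from urllib.parse import urlencode


def build_usher_url(base, m3u8_path, video_id, token, signature):
    """Assemble the playlist URL from the same query keys the diff uses."""
    query = {
        'allow_source': 'true',
        'allow_audio_only': 'true',
        'allow_spectre': 'true',
        # Random value, mirroring the 'p' parameter in the diff
        'p': random.randint(1000000, 10000000),
        'player': 'twitchweb',
        'playlist_include_framerate': 'true',
        'sig': signature,
        'token': token,
    }
    return f'{base}/{m3u8_path}/{video_id}.m3u8?{urlencode(query)}'


# 'https://usher.example' is a placeholder, not the real _USHER_BASE value
vod_url = build_usher_url('https://usher.example', 'vod', '123', 'tok', 'abc')
live_url = build_usher_url('https://usher.example', 'api/channel/hls', 'somechannel', 'tok', 'abc')
```

Keeping the path in a class attribute (`_M3U8_PATH` in the diff) means the two extractors cannot drift apart in their query parameters again.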

@ -0,0 +1,60 @@
import base64
import re
from .common import InfoExtractor
from ..utils import (
extract_attributes,
int_or_none,
parse_iso8601,
)
from ..utils.traversal import traverse_obj
class ViouslyIE(InfoExtractor):
_VALID_URL = False
_WEBPAGE_TESTS = [{
'url': 'http://www.turbo.fr/videos-voiture/454443-turbo-du-07-09-2014-renault-twingo-3-bentley-continental-gt-speed-ces-guide-achat-dacia.html',
'md5': '37a6c3381599381ff53a7e1e0575c0bc',
'info_dict': {
'id': 'F_xQzS2jwb3',
'ext': 'mp4',
'title': 'Turbo du 07/09/2014\xa0: Renault Twingo 3, Bentley Continental GT Speed, CES, Guide Achat Dacia...',
'description': 'Turbo du 07/09/2014\xa0: Renault Twingo 3, Bentley Continental GT Speed, CES, Guide Achat Dacia...',
'age_limit': 0,
'upload_date': '20230328',
'timestamp': 1680037507,
'duration': 3716,
'categories': ['motors'],
}
}]
def _extract_from_webpage(self, url, webpage):
viously_players = re.findall(r'<div[^>]*class="(?:[^"]*\s)?v(?:iou)?sly-player(?:\s[^"]*)?"[^>]*>', webpage)
if not viously_players:
return
def custom_decode(text):
STANDARD_ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/='
CUSTOM_ALPHABET = 'VIOUSLYABCDEFGHJKMNPQRTWXZviouslyabcdefghjkmnpqrtwxz9876543210+/='
data = base64.b64decode(text.translate(str.maketrans(CUSTOM_ALPHABET, STANDARD_ALPHABET)))
return data.decode('utf-8').strip('\x00')
for video_id in traverse_obj(viously_players, (..., {extract_attributes}, 'id')):
formats = self._extract_m3u8_formats(
f'https://www.viously.com/video/hls/{video_id}/index.m3u8', video_id, fatal=False)
if not formats:
continue
data = self._download_json(
f'https://www.viously.com/export/json/{video_id}', video_id,
transform_source=custom_decode, fatal=False)
yield {
'id': video_id,
'formats': formats,
**traverse_obj(data, ('video', {
'title': ('title', {str}),
'description': ('description', {str}),
'duration': ('duration', {int_or_none}),
'timestamp': ('iso_date', {parse_iso8601}),
'categories': ('category', 'name', {str}, {lambda x: [x] if x else None}),
})),
}
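
The `custom_decode` step above is ordinary Base64 with a permuted alphabet. The sketch below reproduces it outside the extractor and adds a `custom_encode` counterpart, which is hypothetical and exists only to demonstrate a round trip; the two alphabets are copied from the diff.

```python
import base64

STANDARD_ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/='
CUSTOM_ALPHABET = 'VIOUSLYABCDEFGHJKMNPQRTWXZviouslyabcdefghjkmnpqrtwxz9876543210+/='


def custom_decode(text):
    """Map the custom alphabet back to standard Base64, then decode."""
    data = base64.b64decode(text.translate(str.maketrans(CUSTOM_ALPHABET, STANDARD_ALPHABET)))
    return data.decode('utf-8').strip('\x00')


def custom_encode(text):
    """Hypothetical inverse, used here only to build test input."""
    encoded = base64.b64encode(text.encode('utf-8')).decode('ascii')
    return encoded.translate(str.maketrans(STANDARD_ALPHABET, CUSTOM_ALPHABET))


payload = '{"video": {"title": "example"}}'
round_tripped = custom_decode(custom_encode(payload))
```

Because both alphabets contain the same 65 characters in a different order, the translation is reversible and the padding character `=` maps to itself.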

@ -2068,7 +2068,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
'title': 'Voyeur Girl', 'title': 'Voyeur Girl',
'description': 'md5:7ae382a65843d6df2685993e90a8628f', 'description': 'md5:7ae382a65843d6df2685993e90a8628f',
'upload_date': '20190312', 'upload_date': '20190312',
'artist': 'Stephen', 'artists': ['Stephen'],
'creators': ['Stephen'],
'track': 'Voyeur Girl', 'track': 'Voyeur Girl',
'album': 'it\'s too much love to know my dear', 'album': 'it\'s too much love to know my dear',
'release_date': '20190313', 'release_date': '20190313',
@ -2081,7 +2082,6 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
'channel': 'Stephen', # TODO: should be "Stephen - Topic" 'channel': 'Stephen', # TODO: should be "Stephen - Topic"
'uploader': 'Stephen', 'uploader': 'Stephen',
'availability': 'public', 'availability': 'public',
'creator': 'Stephen',
'duration': 169, 'duration': 169,
'thumbnail': 'https://i.ytimg.com/vi_webp/MgNrAu2pzNs/maxresdefault.webp', 'thumbnail': 'https://i.ytimg.com/vi_webp/MgNrAu2pzNs/maxresdefault.webp',
'age_limit': 0, 'age_limit': 0,
@ -4386,7 +4386,8 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
release_year = release_date[:4] release_year = release_date[:4]
info.update({ info.update({
'album': mobj.group('album').strip(), 'album': mobj.group('album').strip(),
'artist': mobj.group('clean_artist') or ', '.join(a.strip() for a in mobj.group('artist').split('·')), 'artists': ([a] if (a := mobj.group('clean_artist'))
else [a.strip() for a in mobj.group('artist').split('·')]),
'track': mobj.group('track').strip(), 'track': mobj.group('track').strip(),
'release_date': release_date, 'release_date': release_date,
'release_year': int_or_none(release_year), 'release_year': int_or_none(release_year),
@ -4532,7 +4533,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
if mrr_title == 'Album': if mrr_title == 'Album':
info['album'] = mrr_contents_text info['album'] = mrr_contents_text
elif mrr_title == 'Artist': elif mrr_title == 'Artist':
info['artist'] = mrr_contents_text info['artists'] = [mrr_contents_text]
elif mrr_title == 'Song': elif mrr_title == 'Song':
info['track'] = mrr_contents_text info['track'] = mrr_contents_text
owner_badges = self._extract_badges(traverse_obj(vsir, ('owner', 'videoOwnerRenderer', 'badges'))) owner_badges = self._extract_badges(traverse_obj(vsir, ('owner', 'videoOwnerRenderer', 'badges')))
@ -4566,7 +4567,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
if fmt.get('protocol') == 'm3u8_native': if fmt.get('protocol') == 'm3u8_native':
fmt['__needs_testing'] = True fmt['__needs_testing'] = True
for s_k, d_k in [('artist', 'creator'), ('track', 'alt_title')]: for s_k, d_k in [('artists', 'creators'), ('track', 'alt_title')]:
v = info.get(s_k) v = info.get(s_k)
if v: if v:
info[d_k] = v info[d_k] = v
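
The new `artists` expression in the description-parsing hunk prefers the single cleaned artist and otherwise splits the raw credit line on the middle dot. A standalone sketch of that logic, with `split_artists` as a hypothetical name and plain arguments standing in for the regex match groups:

```python
def split_artists(clean_artist, raw_artist):
    """Return a list of artists, preferring the already-cleaned single value."""
    if clean_artist:
        return [clean_artist]
    # Auto-generated descriptions separate credited artists with '·'
    return [name.strip() for name in raw_artist.split('·')]


single = split_artists('Stephen', 'Stephen · Someone Else')
several = split_artists(None, 'Artist One · Artist Two ·Artist Three')
```

Returning a list in both branches is what lets the later `('artists', 'creators')` copy work without any further type checks.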

@ -738,9 +738,10 @@ class FFmpegMetadataPP(FFmpegPostProcessor):
def add(meta_list, info_list=None): def add(meta_list, info_list=None):
value = next(( value = next((
str(info[key]) for key in [f'{meta_prefix}_'] + list(variadic(info_list or meta_list)) info[key] for key in [f'{meta_prefix}_'] + list(variadic(info_list or meta_list))
if info.get(key) is not None), None) if info.get(key) is not None), None)
if value not in ('', None): if value not in ('', None):
value = ', '.join(map(str, variadic(value)))
value = value.replace('\0', '') # nul character cannot be passed in command line value = value.replace('\0', '') # nul character cannot be passed in command line
metadata['common'].update({meta_f: value for meta_f in variadic(meta_list)}) metadata['common'].update({meta_f: value for meta_f in variadic(meta_list)})
@ -754,10 +755,11 @@ class FFmpegMetadataPP(FFmpegPostProcessor):
add(('description', 'synopsis'), 'description') add(('description', 'synopsis'), 'description')
add(('purl', 'comment'), 'webpage_url') add(('purl', 'comment'), 'webpage_url')
add('track', 'track_number') add('track', 'track_number')
add('artist', ('artist', 'creator', 'uploader', 'uploader_id')) add('artist', ('artist', 'artists', 'creator', 'creators', 'uploader', 'uploader_id'))
add('genre') add('composer', ('composer', 'composers'))
add('genre', ('genre', 'genres'))
add('album') add('album')
add('album_artist') add('album_artist', ('album_artist', 'album_artists'))
add('disc', 'disc_number') add('disc', 'disc_number')
add('show', 'series') add('show', 'series')
add('season_number') add('season_number')
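
The change to `add` above picks the first populated key and only then flattens lists into a string for ffmpeg. A simplified sketch of that selection, where `pick_metadata_value` is a hypothetical name and the local `variadic` is a reduced stand-in for the project's utility of the same name:

```python
def variadic(value):
    """Reduced stand-in: wrap anything that is not a list or tuple in a tuple."""
    return value if isinstance(value, (list, tuple)) else (value,)


def pick_metadata_value(info, keys):
    """Return the first non-None value among keys as a single string, or None."""
    value = next((info[key] for key in keys if info.get(key) is not None), None)
    if value in ('', None):
        return None
    # Lists become comma-separated text; scalars are simply stringified
    joined = ', '.join(map(str, variadic(value)))
    # NUL characters cannot be passed on the command line
    return joined.replace('\0', '')


info = {'artist': None, 'artists': ['Yuja Wang', 'Lorenzo Viotti'], 'track_number': 1}
artist_tag = pick_metadata_value(info, ['artist', 'artists', 'creator', 'creators'])
```

Listing the deprecated key before its list form, as the diff does, means an explicitly set legacy string still wins over the joined list.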

@ -3,6 +3,7 @@ import contextlib
import inspect import inspect
import itertools import itertools
import re import re
import xml.etree.ElementTree
from ._utils import ( from ._utils import (
IDENTITY, IDENTITY,
@ -118,7 +119,7 @@ def traverse_obj(
branching = True branching = True
if isinstance(obj, collections.abc.Mapping): if isinstance(obj, collections.abc.Mapping):
result = obj.values() result = obj.values()
elif is_iterable_like(obj): elif is_iterable_like(obj) or isinstance(obj, xml.etree.ElementTree.Element):
result = obj result = obj
elif isinstance(obj, re.Match): elif isinstance(obj, re.Match):
result = obj.groups() result = obj.groups()
@ -132,7 +133,7 @@ def traverse_obj(
branching = True branching = True
if isinstance(obj, collections.abc.Mapping): if isinstance(obj, collections.abc.Mapping):
iter_obj = obj.items() iter_obj = obj.items()
elif is_iterable_like(obj): elif is_iterable_like(obj) or isinstance(obj, xml.etree.ElementTree.Element):
iter_obj = enumerate(obj) iter_obj = enumerate(obj)
elif isinstance(obj, re.Match): elif isinstance(obj, re.Match):
iter_obj = itertools.chain( iter_obj = itertools.chain(
@ -168,7 +169,7 @@ def traverse_obj(
result = next((v for k, v in obj.groupdict().items() if casefold(k) == key), None) result = next((v for k, v in obj.groupdict().items() if casefold(k) == key), None)
elif isinstance(key, (int, slice)): elif isinstance(key, (int, slice)):
if is_iterable_like(obj, collections.abc.Sequence): if is_iterable_like(obj, (collections.abc.Sequence, xml.etree.ElementTree.Element)):
branching = isinstance(key, slice) branching = isinstance(key, slice)
with contextlib.suppress(IndexError): with contextlib.suppress(IndexError):
result = obj[key] result = obj[key]
@ -176,6 +177,34 @@ def traverse_obj(
with contextlib.suppress(IndexError): with contextlib.suppress(IndexError):
result = str(obj)[key] result = str(obj)[key]
elif isinstance(obj, xml.etree.ElementTree.Element) and isinstance(key, str):
xpath, _, special = key.rpartition('/')
if not special.startswith('@') and special != 'text()':
xpath = key
special = None
# Allow abbreviations of relative paths, absolute paths error
if xpath.startswith('/'):
xpath = f'.{xpath}'
elif xpath and not xpath.startswith('./'):
xpath = f'./{xpath}'
def apply_specials(element):
if special is None:
return element
if special == '@':
return element.attrib
if special.startswith('@'):
return try_call(element.attrib.get, args=(special[1:],))
if special == 'text()':
return element.text
assert False, f'apply_specials is missing case for {special!r}'
if xpath:
result = list(map(apply_specials, obj.iterfind(xpath)))
else:
result = apply_specials(obj)
return branching, result if branching else (result,) return branching, result if branching else (result,)
def lazy_last(iterable): def lazy_last(iterable):