mirror of
https://github.com/yt-dlp/yt-dlp.git
synced 2024-11-28 10:11:25 +01:00
Compare commits
6 Commits
63b0c8e629
...
223d141ad4
SHA1 |
---|
223d141ad4 |
beae2db127 |
3945677a75 |
b103aca24d |
5c7a5aaab2 |
422195ec70 |
12
README.md
@@ -1553,9 +1553,9 @@ The available fields are:
 
 All fields, unless specified otherwise, are sorted in descending order. To reverse this, prefix the field with a `+`. E.g. `+res` prefers format with the smallest resolution. Additionally, you can suffix a preferred value for the fields, separated by a `:`. E.g. `res:720` prefers larger videos, but no larger than 720p and the smallest video if there are no videos less than 720p. For `codec` and `ext`, you can provide two preferred values, the first for video and the second for audio. E.g. `+codec:avc:m4a` (equivalent to `+vcodec:avc,+acodec:m4a`) sets the video codec preference to `h264` > `h265` > `vp9` > `vp9.2` > `av01` > `vp8` > `h263` > `theora` and audio codec preference to `mp4a` > `aac` > `vorbis` > `opus` > `mp3` > `ac3` > `dts`. You can also make the sorting prefer the nearest values to the provided by using `~` as the delimiter. E.g. `filesize~1G` prefers the format with filesize closest to 1 GiB.
 
-The fields `hasvid` and `ie_pref` are always given highest priority in sorting, irrespective of the user-defined order. This behavior can be changed by using `--format-sort-force`. Apart from these, the default order used is: `lang,quality,res,fps,hdr:12,vcodec:vp9.2,channels,acodec,size,br,asr,proto,ext,hasaud,source,id`. The extractors may override this default order, but they cannot override the user-provided order.
+The fields `hasvid` and `ie_pref` are always given highest priority in sorting, irrespective of the user-defined order. This behavior can be changed by using `--format-sort-force`. Apart from these, the default order used is: `lang,quality,res,fps,hdr:12,vcodec,channels,acodec,size,br,asr,proto,ext,hasaud,source,id`. The extractors may override this default order, but they cannot override the user-provided order.
 
-Note that the default has `vcodec:vp9.2`; i.e. `av1` is not preferred. Similarly, the default for hdr is `hdr:12`; i.e. Dolby Vision is not preferred. These choices are made since DV and AV1 formats are not yet fully compatible with most devices. This may be changed in the future as more devices become capable of smoothly playing back these formats.
+Note that the default for hdr is `hdr:12`; i.e. Dolby Vision is not preferred. This choice was made since DV formats are not yet fully compatible with most devices. This may be changed in the future.
 
 If your format selector is `worst`, the last item is selected after sorting. This means it will select the format that is worst in all respects. Most of the time, what you actually want is the video with the smallest filesize instead. So it is generally better to use `-f best -S +size,+br,+res,+fps`.
 
|
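The sort-field syntax described in the README hunk above (a `+` prefix to reverse, `:` for a preferred limit, `~` for "closest to") can be illustrated with a small parser. This is only a sketch: `SortField`, `parse_sort_field`, `parse_sort_order` and the regular expression are hypothetical names written for this note, not yt-dlp's implementation.

```python
import re
from typing import NamedTuple, Optional


class SortField(NamedTuple):
    """One parsed entry of a sort order such as `+res` or `filesize~1G`."""
    name: str
    reverse: bool          # True when the entry is prefixed with `+`
    closest: bool          # True when `~` is used as the delimiter
    limit: Optional[str]   # raw preferred value, e.g. '720' or 'avc:m4a'


# Assumed grammar: optional '+', a field name, then optionally ':' or '~' and a value.
_FIELD_RE = re.compile(
    r'(?P<reverse>\+)?(?P<name>[A-Za-z0-9_]+)(?:(?P<sep>[:~])(?P<limit>.+))?$')


def parse_sort_field(spec: str) -> SortField:
    match = _FIELD_RE.match(spec)
    if match is None:
        raise ValueError(f'invalid sort field: {spec!r}')
    return SortField(
        name=match['name'],
        reverse=match['reverse'] is not None,
        closest=match['sep'] == '~',
        limit=match['limit'],
    )


def parse_sort_order(order: str) -> list[SortField]:
    """Split a comma-separated order and parse each non-empty entry."""
    return [parse_sort_field(item.strip()) for item in order.split(',') if item.strip()]
```

For example, `parse_sort_order('+size,+br,+res,+fps')` yields four entries with `reverse=True`, matching the `-S +size,+br,+res,+fps` suggestion quoted in the hunk.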
@@ -2205,7 +2205,7 @@ Some of yt-dlp's default options are different from that of youtube-dl and youtu
 * `avconv` is not supported as an alternative to `ffmpeg`
 * yt-dlp stores config files in slightly different locations to youtube-dl. See [CONFIGURATION](#configuration) for a list of correct locations
 * The default [output template](#output-template) is `%(title)s [%(id)s].%(ext)s`. There is no real reason for this change. This was changed before yt-dlp was ever made public and now there are no plans to change it back to `%(title)s-%(id)s.%(ext)s`. Instead, you may use `--compat-options filename`
-* The default [format sorting](#sorting-formats) is different from youtube-dl and prefers higher resolution and better codecs rather than higher bitrates. You can use the `--format-sort` option to change this to any order you prefer, or use `--compat-options format-sort` to use youtube-dl's sorting order
+* The default [format sorting](#sorting-formats) is different from youtube-dl and prefers higher resolution and better codecs rather than higher bitrates. You can use the `--format-sort` option to change this to any order you prefer, or use `--compat-options format-sort` to use youtube-dl's sorting order. Older versions of yt-dlp preferred VP9 due to its broader compatibility; you can use `--compat-options prefer-vp9-sort` to revert to that format sorting preference. These two compat options cannot be used together
 * The default format selector is `bv*+ba/b`. This means that if a combined video + audio format that is better than the best video-only format is found, the former will be preferred. Use `-f bv+ba/b` or `--compat-options format-spec` to revert this
 * Unlike youtube-dlc, yt-dlp does not allow merging multiple audio/video streams into one file by default (since this conflicts with the use of `-f bv*+ba`). If needed, this feature must be enabled using `--audio-multistreams` and `--video-multistreams`. You can also use `--compat-options multistreams` to enable both
 * `--no-abort-on-error` is enabled by default. Use `--abort-on-error` or `--compat-options abort-on-error` to abort on errors instead
@@ -2234,11 +2234,11 @@ Some of yt-dlp's default options are different from that of youtube-dl and youtu
 For ease of use, a few more compat options are available:
 
 * `--compat-options all`: Use all compat options (**Do NOT use this!**)
-* `--compat-options youtube-dl`: Same as `--compat-options all,-multistreams,-playlist-match-filter,-manifest-filesize-approx,-allow-unsafe-ext`
-* `--compat-options youtube-dlc`: Same as `--compat-options all,-no-live-chat,-no-youtube-channel-redirect,-playlist-match-filter,-manifest-filesize-approx,-allow-unsafe-ext`
+* `--compat-options youtube-dl`: Same as `--compat-options all,-multistreams,-playlist-match-filter,-manifest-filesize-approx,-allow-unsafe-ext,-prefer-vp9-sort`
+* `--compat-options youtube-dlc`: Same as `--compat-options all,-no-live-chat,-no-youtube-channel-redirect,-playlist-match-filter,-manifest-filesize-approx,-allow-unsafe-ext,-prefer-vp9-sort`
 * `--compat-options 2021`: Same as `--compat-options 2022,no-certifi,filename-sanitization,no-youtube-prefer-utc-upload-date`
 * `--compat-options 2022`: Same as `--compat-options 2023,playlist-match-filter,no-external-downloader-progress,prefer-legacy-http-handler,manifest-filesize-approx`
-* `--compat-options 2023`: Currently does nothing. Use this to enable all future compat options
+* `--compat-options 2023`: Same as `--compat-options prefer-vp9-sort`. Use this to enable all future compat options
 
 The following compat options restore vulnerable behavior from before security patches:
 
@@ -82,6 +82,18 @@ class TestAES(unittest.TestCase):
                 data, bytes(self.key), authentication_tag, bytes(self.iv[:12]))
             self.assertEqual(decrypted.rstrip(b'\x08'), self.secret_msg)
 
+    def test_gcm_aligned_decrypt(self):
+        data = b'\x159Y\xcf5eud\x90\x9c\x85&]\x14\x1d\x0f'
+        authentication_tag = b'\x08\xb1\x9d!&\x98\xd0\xeaRq\x90\xe6;\xb5]\xd8'
+
+        decrypted = bytes(aes_gcm_decrypt_and_verify(
+            list(data), self.key, list(authentication_tag), self.iv[:12]))
+        self.assertEqual(decrypted.rstrip(b'\x08'), self.secret_msg[:16])
+        if Cryptodome.AES:
+            decrypted = aes_gcm_decrypt_and_verify_bytes(
+                data, bytes(self.key), authentication_tag, bytes(self.iv[:12]))
+            self.assertEqual(decrypted.rstrip(b'\x08'), self.secret_msg[:16])
+
     def test_decrypt_text(self):
         password = bytes(self.key).decode()
         encrypted = base64.b64encode(
@@ -9,13 +9,17 @@ from yt_dlp.utils import (
     determine_ext,
     dict_get,
     int_or_none,
+    join_nonempty,
     str_or_none,
 )
 from yt_dlp.utils.traversal import (
+    find_element,
+    find_elements,
     require,
     subs_list_to_dict,
     traverse_obj,
     trim_str,
+    unpack,
 )
 
 _TEST_DATA = {
@@ -35,6 +39,14 @@ _TEST_DATA = {
     'dict': {},
 }
 
+_TEST_HTML = '''<html><body>
+    <div class="a">1</div>
+    <div class="a" id="x" custom="z">2</div>
+    <div class="b" data-id="y" custom="z">3</div>
+    <p class="a">4</p>
+    <p id="d" custom="e">5</p>
+</body></html>'''
+
 
 class TestTraversal:
     def test_traversal_base(self):
@@ -510,6 +522,59 @@ class TestTraversalHelpers:
         assert trim_str(start='abc', end='abc')('abc') == ''
         assert trim_str(start='', end='')('abc') == 'abc'
 
+    def test_unpack(self):
+        assert unpack(lambda *x: ''.join(map(str, x)))([1, 2, 3]) == '123'
+        assert unpack(join_nonempty)([1, 2, 3]) == '1-2-3'
+        assert unpack(join_nonempty(delim=' '))([1, 2, 3]) == '1 2 3'
+        with pytest.raises(TypeError):
+            unpack(join_nonempty)()
+        with pytest.raises(TypeError):
+            unpack()
+
+    def test_find_element(self):
+        for improper_kwargs in [
+            dict(attr='data-id'),
+            dict(value='y'),
+            dict(attr='data-id', value='y', cls='a'),
+            dict(attr='data-id', value='y', id='x'),
+            dict(cls='a', id='x'),
+            dict(cls='a', tag='p'),
+            dict(cls='[ab]', regex=True),
+        ]:
+            with pytest.raises(AssertionError):
+                find_element(**improper_kwargs)(_TEST_HTML)
+
+        assert find_element(cls='a')(_TEST_HTML) == '1'
+        assert find_element(cls='a', html=True)(_TEST_HTML) == '<div class="a">1</div>'
+        assert find_element(id='x')(_TEST_HTML) == '2'
+        assert find_element(id='[ex]')(_TEST_HTML) is None
+        assert find_element(id='[ex]', regex=True)(_TEST_HTML) == '2'
+        assert find_element(id='x', html=True)(_TEST_HTML) == '<div class="a" id="x" custom="z">2</div>'
+        assert find_element(attr='data-id', value='y')(_TEST_HTML) == '3'
+        assert find_element(attr='data-id', value='y(?:es)?')(_TEST_HTML) is None
+        assert find_element(attr='data-id', value='y(?:es)?', regex=True)(_TEST_HTML) == '3'
+        assert find_element(
+            attr='data-id', value='y', html=True)(_TEST_HTML) == '<div class="b" data-id="y" custom="z">3</div>'
+
+    def test_find_elements(self):
+        for improper_kwargs in [
+            dict(tag='p'),
+            dict(attr='data-id'),
+            dict(value='y'),
+            dict(attr='data-id', value='y', cls='a'),
+            dict(cls='a', tag='div'),
+            dict(cls='[ab]', regex=True),
+        ]:
+            with pytest.raises(AssertionError):
+                find_elements(**improper_kwargs)(_TEST_HTML)
+
+        assert find_elements(cls='a')(_TEST_HTML) == ['1', '2', '4']
+        assert find_elements(cls='a', html=True)(_TEST_HTML) == [
+            '<div class="a">1</div>', '<div class="a" id="x" custom="z">2</div>', '<p class="a">4</p>']
+        assert find_elements(attr='custom', value='z')(_TEST_HTML) == ['2', '3']
+        assert find_elements(attr='custom', value='[ez]')(_TEST_HTML) == []
+        assert find_elements(attr='custom', value='[ez]', regex=True)(_TEST_HTML) == ['2', '3', '5']
+
 
 class TestDictGet:
     def test_dict_get(self):
|
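The `test_unpack` cases added above suggest that `unpack` wraps a function so it accepts one iterable and spreads it into positional arguments. A minimal sketch of that idea follows. It is an assumption about the behaviour, not the library's code: `join_values` is a hypothetical stand-in for `join_nonempty`, and the delimiter is passed to `unpack` as a keyword here instead of through the partial application `join_nonempty(delim=' ')` that the test uses.

```python
import functools


def join_values(*values, delim='-'):
    """Hypothetical stand-in for join_nonempty: join truthy values as strings."""
    return delim.join(str(value) for value in values if value)


def unpack(func, **kwargs):
    """Return a wrapper that calls func(*items, **kwargs) for a single iterable argument."""
    @functools.wraps(func)
    def inner(items):
        return func(*items, **kwargs)
    return inner
```

Under this sketch, calling `unpack()` without a function, or calling the wrapper without an argument, raises `TypeError` from Python's own argument checking, which is consistent with the two `pytest.raises(TypeError)` cases in the test.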
@@ -469,7 +469,7 @@ class YoutubeDL:
                        The following options do not work when used through the API:
                        filename, abort-on-error, multistreams, no-live-chat,
                        format-sort, no-clean-infojson, no-playlist-metafiles,
-                       no-keep-subs, no-attach-info-json, allow-unsafe-ext.
+                       no-keep-subs, no-attach-info-json, allow-unsafe-ext, prefer-vp9-sort.
                        Refer __init__.py for their implementation
     progress_template: Dictionary of templates for progress outputs.
                        Allowed keys are 'download', 'postprocess',
@@ -157,6 +157,9 @@ def set_compat_opts(opts):
         opts.embed_infojson = False
     if 'format-sort' in opts.compat_opts:
         opts.format_sort.extend(FormatSorter.ytdl_default)
+    elif 'prefer-vp9-sort' in opts.compat_opts:
+        opts.format_sort.extend(FormatSorter._prefer_vp9_sort)
+
     _video_multistreams_set = set_default_compat('multistreams', 'allow_multiple_video_streams', False, remove_compat=False)
     _audio_multistreams_set = set_default_compat('multistreams', 'allow_multiple_audio_streams', False, remove_compat=False)
     if _video_multistreams_set is False and _audio_multistreams_set is False:
|
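The `if`/`elif` added in the `set_compat_opts` hunk above means that at most one of the two sort orders is appended to the user's `format_sort`. The sketch below mirrors that branch structure in isolation. It rests on assumptions: `extend_format_sort` is a hypothetical helper, `PREFER_VP9_ORDER` is copied from the pre-change default order in the README hunk (the actual contents of `FormatSorter._prefer_vp9_sort` are not shown in this diff), and `YTDL_ORDER` is a placeholder because youtube-dl's order is not shown either.

```python
# Assumption: the pre-change default order quoted in the README hunk of this diff.
PREFER_VP9_ORDER = (
    'lang,quality,res,fps,hdr:12,vcodec:vp9.2,channels,acodec,'
    'size,br,asr,proto,ext,hasaud,source,id'
).split(',')

# Placeholder only: the real youtube-dl order does not appear in this diff.
YTDL_ORDER = ['<youtube-dl order>']


def extend_format_sort(user_sort, compat_opts):
    """Append one compat sort order after the user's fields; 'format-sort' takes precedence."""
    result = list(user_sort)
    if 'format-sort' in compat_opts:
        result.extend(YTDL_ORDER)
    elif 'prefer-vp9-sort' in compat_opts:
        result.extend(PREFER_VP9_ORDER)
    return result
```

Because the second branch is an `elif`, passing both options in this sketch applies only the `format-sort` order, which fits the README note that the two compat options cannot be used together.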
@@ -229,12 +229,12 @@ def aes_gcm_decrypt_and_verify(data, key, tag, nonce):
     iv_ctr = inc(j0)
 
     decrypted_data = aes_ctr_decrypt(data, key, iv_ctr + [0] * (BLOCK_SIZE_BYTES - len(iv_ctr)))
-    pad_len = len(data) // 16 * 16
+    pad_len = (BLOCK_SIZE_BYTES - (len(data) % BLOCK_SIZE_BYTES)) % BLOCK_SIZE_BYTES
     s_tag = ghash(
         hash_subkey,
         data
-        + [0] * (BLOCK_SIZE_BYTES - len(data) + pad_len)  # pad
+        + [0] * pad_len  # pad
         + list((0 * 8).to_bytes(8, 'big')  # length of associated data
                + ((len(data) * 8).to_bytes(8, 'big'))),  # length of data
     )
 
|
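The padding change in the `aes_gcm_decrypt_and_verify` hunk above can be checked with plain arithmetic. The sketch below compares the number of zero bytes appended before and after the change; `old_pad_count` and `new_pad_count` are hypothetical helper names, and `BLOCK_SIZE_BYTES = 16` is assumed from the literal `16` in the removed line.

```python
BLOCK_SIZE_BYTES = 16  # assumed AES block size, matching the literal 16 in the old code


def old_pad_count(length):
    """Zero bytes appended by the removed lines for data of the given length."""
    pad_len = length // 16 * 16
    return BLOCK_SIZE_BYTES - length + pad_len


def new_pad_count(length):
    """Zero bytes appended by the added lines: just enough to reach a block boundary."""
    return (BLOCK_SIZE_BYTES - (length % BLOCK_SIZE_BYTES)) % BLOCK_SIZE_BYTES


if __name__ == '__main__':
    for length in (5, 16, 20, 32):
        print(length, old_pad_count(length), new_pad_count(length))
```

The two formulas agree whenever the length is not a multiple of 16. For block-aligned lengths such as 16 or 32, the old formula appends a full extra block of zeros while the new one appends none, which appears to be the case the new `test_gcm_aligned_decrypt` test exercises with its 16-byte ciphertext.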
@@ -278,6 +278,7 @@ from .bleacherreport import (
 from .blerp import BlerpIE
 from .blogger import BloggerIE
 from .bloomberg import BloombergIE
+from .bluesky import BlueskyIE
 from .bokecc import BokeCCIE
 from .bongacams import BongaCamsIE
 from .boosty import BoostyIE
388
yt_dlp/extractor/bluesky.py
Normal file
@@ -0,0 +1,388 @@
+from .common import InfoExtractor
+from ..utils import (
+    ExtractorError,
+    format_field,
+    int_or_none,
+    mimetype2ext,
+    orderedSet,
+    parse_iso8601,
+    truncate_string,
+    update_url_query,
+    url_basename,
+    url_or_none,
+    variadic,
+)
+from ..utils.traversal import traverse_obj
+
+
+class BlueskyIE(InfoExtractor):
+    _VALID_URL = [
+        r'https?://(?:www\.)?(?:bsky\.app|main\.bsky\.dev)/profile/(?P<handle>[\w.:%-]+)/post/(?P<id>\w+)',
+        r'at://(?P<handle>[\w.:%-]+)/app\.bsky\.feed\.post/(?P<id>\w+)',
+    ]
+    _TESTS = [{
+        'url': 'https://bsky.app/profile/blu3blue.bsky.social/post/3l4omssdl632g',
+        'md5': '375539c1930ab05d15585ed772ab54fd',
+        'info_dict': {
+            'id': '3l4omssdl632g',
+            'ext': 'mp4',
+            'uploader': 'Blu3Blu3Lilith',
+            'uploader_id': 'blu3blue.bsky.social',
+            'uploader_url': 'https://bsky.app/profile/blu3blue.bsky.social',
+            'channel_id': 'did:plc:pzdr5ylumf7vmvwasrpr5bf2',
+            'channel_url': 'https://bsky.app/profile/did:plc:pzdr5ylumf7vmvwasrpr5bf2',
+            'thumbnail': r're:https://video.bsky.app/watch/.*\.jpg$',
+            'title': 'OMG WE HAVE VIDEOS NOW',
+            'description': 'OMG WE HAVE VIDEOS NOW',
+            'upload_date': '20240921',
+            'timestamp': 1726940605,
+            'like_count': int,
+            'repost_count': int,
+            'comment_count': int,
+            'tags': [],
+        },
+    }, {
+        'url': 'https://bsky.app/profile/bsky.app/post/3l3vgf77uco2g',
+        'md5': 'b9e344fdbce9f2852c668a97efefb105',
+        'info_dict': {
+            'id': '3l3vgf77uco2g',
+            'ext': 'mp4',
+            'uploader': 'Bluesky',
+            'uploader_id': 'bsky.app',
+            'uploader_url': 'https://bsky.app/profile/bsky.app',
+            'channel_id': 'did:plc:z72i7hdynmk6r22z27h6tvur',
+            'channel_url': 'https://bsky.app/profile/did:plc:z72i7hdynmk6r22z27h6tvur',
+            'thumbnail': r're:https://video.bsky.app/watch/.*\.jpg$',
+            'title': 'Bluesky now has video! Update your app to versi...',
+            'alt_title': 'Bluesky video feature announcement',
+            'description': r're:(?s)Bluesky now has video! .{239}',
+            'upload_date': '20240911',
+            'timestamp': 1726074716,
+            'like_count': int,
+            'repost_count': int,
+            'comment_count': int,
+            'tags': [],
+            'subtitles': {
+                'en': 'mincount:1',
+            },
+        },
+    }, {
+        'url': 'https://main.bsky.dev/profile/souris.moe/post/3l4qhp7bcs52c',
+        'md5': '5f2df8c200b5633eb7fb2c984d29772f',
+        'info_dict': {
+            'id': '3l4qhp7bcs52c',
+            'ext': 'mp4',
+            'uploader': 'souris',
+            'uploader_id': 'souris.moe',
+            'uploader_url': 'https://bsky.app/profile/souris.moe',
+            'channel_id': 'did:plc:tj7g244gl5v6ai6cm4f4wlqp',
+            'channel_url': 'https://bsky.app/profile/did:plc:tj7g244gl5v6ai6cm4f4wlqp',
+            'thumbnail': r're:https://video.bsky.app/watch/.*\.jpg$',
+            'title': 'Bluesky video #3l4qhp7bcs52c',
+            'upload_date': '20240922',
+            'timestamp': 1727003838,
+            'like_count': int,
+            'repost_count': int,
+            'comment_count': int,
+            'tags': [],
+        },
+    }, {
+        'url': 'https://bsky.app/profile/de1.pds.tentacle.expert/post/3l3w4tnezek2e',
+        'md5': '1af9c7fda061cf7593bbffca89e43d1c',
+        'info_dict': {
+            'id': '3l3w4tnezek2e',
+            'ext': 'mp4',
+            'uploader': 'clean',
+            'uploader_id': 'de1.pds.tentacle.expert',
+            'uploader_url': 'https://bsky.app/profile/de1.pds.tentacle.expert',
+            'channel_id': 'did:web:de1.tentacle.expert',
+            'channel_url': 'https://bsky.app/profile/did:web:de1.tentacle.expert',
+            'thumbnail': r're:https://video.bsky.app/watch/.*\.jpg$',
+            'title': 'Bluesky video #3l3w4tnezek2e',
+            'upload_date': '20240911',
+            'timestamp': 1726098823,
+            'like_count': int,
+            'repost_count': int,
+            'comment_count': int,
+            'tags': [],
+        },
+    }, {
+        'url': 'https://bsky.app/profile/yunayuispink.bsky.social/post/3l7gqcfes742o',
+        'info_dict': {
+            'id': 'XxK3t_5V3ao',
+            'ext': 'mp4',
+            'uploader': 'yunayu',
+            'uploader_id': '@yunayuispink',
+            'uploader_url': 'https://www.youtube.com/@yunayuispink',
+            'channel': 'yunayu',
+            'channel_id': 'UCPLvXnHa7lTyNoR_dGsU14w',
+            'channel_url': 'https://www.youtube.com/channel/UCPLvXnHa7lTyNoR_dGsU14w',
+            'thumbnail': 'https://i.ytimg.com/vi_webp/XxK3t_5V3ao/maxresdefault.webp',
+            'description': r're:Have a good goodx10000day',
+            'title': '5min vs 5hours drawing',
+            'availability': 'public',
+            'live_status': 'not_live',
+            'playable_in_embed': True,
+            'upload_date': '20241026',
+            'timestamp': 1729967784,
+            'duration': 321,
+            'age_limit': 0,
+            'like_count': int,
+            'view_count': int,
+            'comment_count': int,
+            'channel_follower_count': int,
+            'categories': ['Entertainment'],
+            'tags': [],
+        },
+        'add_ie': ['Youtube'],
+    }, {
+        'url': 'https://bsky.app/profile/endshark.bsky.social/post/3jzxjkcemae2m',
+        'info_dict': {
+            'id': '222792849',
+            'ext': 'mp3',
+            'uploader': 'LASERBAT',
+            'uploader_id': 'laserbatx',
+            'uploader_url': 'https://laserbatx.bandcamp.com',
+            'artists': ['LASERBAT'],
+            'album_artists': ['LASERBAT'],
+            'album': 'Hari Nezumi [EP]',
+            'track': 'Forward to the End',
+            'title': 'LASERBAT - Forward to the End',
+            'thumbnail': 'https://f4.bcbits.com/img/a2507705510_5.jpg',
+            'duration': 228.571,
+            'track_id': '222792849',
+            'release_date': '20230423',
+            'upload_date': '20230423',
+            'timestamp': 1682276040.0,
+            'release_timestamp': 1682276040.0,
+            'track_number': 1,
+        },
+        'add_ie': ['Bandcamp'],
+    }, {
+        'url': 'https://bsky.app/profile/dannybhoix.bsky.social/post/3l6oe5mtr2c2j',
+        'md5': 'b9e344fdbce9f2852c668a97efefb105',
+        'info_dict': {
+            'id': '3l3vgf77uco2g',
+            'ext': 'mp4',
+            'uploader': 'Bluesky',
+            'uploader_id': 'bsky.app',
+            'uploader_url': 'https://bsky.app/profile/bsky.app',
+            'channel_id': 'did:plc:z72i7hdynmk6r22z27h6tvur',
+            'channel_url': 'https://bsky.app/profile/did:plc:z72i7hdynmk6r22z27h6tvur',
+            'thumbnail': r're:https://video.bsky.app/watch/.*\.jpg$',
+            'title': 'Bluesky now has video! Update your app to versi...',
+            'alt_title': 'Bluesky video feature announcement',
+            'description': r're:(?s)Bluesky now has video! .{239}',
+            'upload_date': '20240911',
+            'timestamp': 1726074716,
+            'like_count': int,
+            'repost_count': int,
+            'comment_count': int,
+            'tags': [],
+            'subtitles': {
+                'en': 'mincount:1',
+            },
+        },
+    }, {
+        'url': 'https://bsky.app/profile/alt.bun.how/post/3l7rdfxhyds2f',
+        'md5': '8775118b235cf9fa6b5ad30f95cda75c',
+        'info_dict': {
+            'id': '3l7rdfxhyds2f',
+            'ext': 'mp4',
+            'uploader': 'cinnamon',
+            'uploader_id': 'alt.bun.how',
+            'uploader_url': 'https://bsky.app/profile/alt.bun.how',
+            'channel_id': 'did:plc:7x6rtuenkuvxq3zsvffp2ide',
+            'channel_url': 'https://bsky.app/profile/did:plc:7x6rtuenkuvxq3zsvffp2ide',
+            'thumbnail': r're:https://video.bsky.app/watch/.*\.jpg$',
+            'title': 'crazy that i look like this tbh',
+            'description': 'crazy that i look like this tbh',
+            'upload_date': '20241030',
+            'timestamp': 1730332128,
+            'like_count': int,
+            'repost_count': int,
+            'comment_count': int,
+            'tags': ['sexual'],
+            'age_limit': 18,
+        },
+    }, {
+        'url': 'at://did:plc:ia76kvnndjutgedggx2ibrem/app.bsky.feed.post/3l6zrz6zyl2dr',
+        'md5': '71b0eb6d85d03145e6af6642c7fc6d78',
+        'info_dict': {
+            'id': '3l6zrz6zyl2dr',
+            'ext': 'mp4',
+            'uploader': 'mary🐇',
+            'uploader_id': 'mary.my.id',
+            'uploader_url': 'https://bsky.app/profile/mary.my.id',
+            'channel_id': 'did:plc:ia76kvnndjutgedggx2ibrem',
+            'channel_url': 'https://bsky.app/profile/did:plc:ia76kvnndjutgedggx2ibrem',
+            'thumbnail': r're:https://video.bsky.app/watch/.*\.jpg$',
+            'title': 'Bluesky video #3l6zrz6zyl2dr',
+            'upload_date': '20241021',
+            'timestamp': 1729523172,
+            'like_count': int,
+            'repost_count': int,
+            'comment_count': int,
+            'tags': [],
+        },
+    }, {
+        'url': 'https://bsky.app/profile/purpleicetea.bsky.social/post/3l7gv55dc2o2w',
+        'info_dict': {
+            'id': '3l7gv55dc2o2w',
+        },
+        'playlist': [{
+            'info_dict': {
+                'id': '3l7gv55dc2o2w',
+                'ext': 'mp4',
+                'upload_date': '20241026',
+                'description': 'One of my favorite videos',
+                'comment_count': int,
+                'uploader_url': 'https://bsky.app/profile/purpleicetea.bsky.social',
+                'uploader': 'Purple.Ice.Tea',
+                'thumbnail': r're:https://video.bsky.app/watch/.*\.jpg$',
+                'channel_url': 'https://bsky.app/profile/did:plc:bjh5ffwya5f53dfy47dezuwx',
+                'like_count': int,
+                'channel_id': 'did:plc:bjh5ffwya5f53dfy47dezuwx',
+                'repost_count': int,
+                'timestamp': 1729973202,
+                'tags': [],
+                'uploader_id': 'purpleicetea.bsky.social',
+                'title': 'One of my favorite videos',
+            },
+        }, {
+            'info_dict': {
+                'id': '3l77u64l7le2e',
+                'ext': 'mp4',
+                'title': 'hearing people on twitter say that bluesky isn\'...',
+                'like_count': int,
+                'uploader_id': 'thafnine.net',
+                'uploader_url': 'https://bsky.app/profile/thafnine.net',
+                'upload_date': '20241024',
+                'channel_url': 'https://bsky.app/profile/did:plc:6ttyq36rhiyed7wu3ws7dmqj',
+                'description': r're:(?s)hearing people on twitter say that bluesky .{93}',
+                'tags': [],
+                'alt_title': 'md5:9b1ee1937fb3d1a81e932f9ec14d560e',
+                'uploader': 'T9',
+                'channel_id': 'did:plc:6ttyq36rhiyed7wu3ws7dmqj',
+                'thumbnail': r're:https://video.bsky.app/watch/.*\.jpg$',
+                'timestamp': 1729731642,
+                'comment_count': int,
+                'repost_count': int,
+            },
+        }],
+    }]
+    _BLOB_URL_TMPL = '{}/xrpc/com.atproto.sync.getBlob'
+
+    def _get_service_endpoint(self, did, video_id):
+        if did.startswith('did:web:'):
+            url = f'https://{did[8:]}/.well-known/did.json'
+        else:
+            url = f'https://plc.directory/{did}'
+        services = self._download_json(
+            url, video_id, 'Fetching service endpoint', 'Falling back to bsky.social', fatal=False)
+        return traverse_obj(
+            services, ('service', lambda _, x: x['type'] == 'AtprotoPersonalDataServer',
+                       'serviceEndpoint', {url_or_none}, any)) or 'https://bsky.social'
+
+    def _real_extract(self, url):
+        handle, video_id = self._match_valid_url(url).group('handle', 'id')
+
+        post = self._download_json(
+            'https://public.api.bsky.app/xrpc/app.bsky.feed.getPostThread',
+            video_id, query={
+                'uri': f'at://{handle}/app.bsky.feed.post/{video_id}',
+                'depth': 0,
+                'parentHeight': 0,
+            })['thread']['post']
+
+        entries = []
+        # app.bsky.embed.video.view/app.bsky.embed.external.view
+        entries.extend(self._extract_videos(post, video_id))
+        # app.bsky.embed.recordWithMedia.view
+        entries.extend(self._extract_videos(
+            post, video_id, embed_path=('embed', 'media'), record_subpath=('embed', 'media')))
+        # app.bsky.embed.record.view
+        if nested_post := traverse_obj(post, ('embed', 'record', ('record', None), {dict}, any)):
+            entries.extend(self._extract_videos(
+                nested_post, video_id, embed_path=('embeds', 0), record_path='value'))
+
+        if not entries:
+            raise ExtractorError('No video could be found in this post', expected=True)
+        if len(entries) == 1:
+            return entries[0]
+        return self.playlist_result(entries, video_id)
+
+    @staticmethod
+    def _build_profile_url(path):
+        return format_field(path, None, 'https://bsky.app/profile/%s', default=None)
+
+    def _extract_videos(self, root, video_id, embed_path='embed', record_path='record', record_subpath='embed'):
+        embed_path = variadic(embed_path, (str, bytes, dict, set))
+        record_path = variadic(record_path, (str, bytes, dict, set))
+        record_subpath = variadic(record_subpath, (str, bytes, dict, set))
+
+        entries = []
+        if external_uri := traverse_obj(root, (
+                ((*record_path, *record_subpath), embed_path), 'external', 'uri', {url_or_none}, any)):
+            entries.append(self.url_result(external_uri))
+        if playlist := traverse_obj(root, (*embed_path, 'playlist', {url_or_none})):
+            formats, subtitles = self._extract_m3u8_formats_and_subtitles(
+                playlist, video_id, 'mp4', m3u8_id='hls', fatal=False)
+        else:
+            return entries
+
+        video_cid = traverse_obj(
+            root, (*embed_path, 'cid', {str}),
+            (*record_path, *record_subpath, 'video', 'ref', '$link', {str}))
+        did = traverse_obj(root, ('author', 'did', {str}))
+
+        if did and video_cid:
+            endpoint = self._get_service_endpoint(did, video_id)
+
+            formats.append({
+                'format_id': 'blob',
+                'url': update_url_query(
+                    self._BLOB_URL_TMPL.format(endpoint), {'did': did, 'cid': video_cid}),
+                **traverse_obj(root, (*embed_path, 'aspectRatio', {
+                    'width': ('width', {int_or_none}),
+                    'height': ('height', {int_or_none}),
+                })),
+                **traverse_obj(root, (*record_path, *record_subpath, 'video', {
|
||||||
|
'filesize': ('size', {int_or_none}),
|
||||||
|
'ext': ('mimeType', {mimetype2ext}),
|
||||||
|
})),
|
||||||
|
})
|
||||||
|
|
||||||
|
for sub_data in traverse_obj(root, (
|
||||||
|
*record_path, *record_subpath, 'captions', lambda _, v: v['file']['ref']['$link'])):
|
||||||
|
subtitles.setdefault(sub_data.get('lang') or 'und', []).append({
|
||||||
|
'url': update_url_query(
|
||||||
|
self._BLOB_URL_TMPL.format(endpoint), {'did': did, 'cid': sub_data['file']['ref']['$link']}),
|
||||||
|
'ext': traverse_obj(sub_data, ('file', 'mimeType', {mimetype2ext})),
|
||||||
|
})
|
||||||
|
|
||||||
|
entries.append({
|
||||||
|
'id': video_id,
|
||||||
|
'formats': formats,
|
||||||
|
'subtitles': subtitles,
|
||||||
|
**traverse_obj(root, {
|
||||||
|
'id': ('uri', {url_basename}),
|
||||||
|
'thumbnail': (*embed_path, 'thumbnail', {url_or_none}),
|
||||||
|
'alt_title': (*embed_path, 'alt', {str}, filter),
|
||||||
|
'uploader': ('author', 'displayName', {str}),
|
||||||
|
'uploader_id': ('author', 'handle', {str}),
|
||||||
|
'uploader_url': ('author', 'handle', {self._build_profile_url}),
|
||||||
|
'channel_id': ('author', 'did', {str}),
|
||||||
|
'channel_url': ('author', 'did', {self._build_profile_url}),
|
||||||
|
'like_count': ('likeCount', {int_or_none}),
|
||||||
|
'repost_count': ('repostCount', {int_or_none}),
|
||||||
|
'comment_count': ('replyCount', {int_or_none}),
|
||||||
|
'timestamp': ('indexedAt', {parse_iso8601}),
|
||||||
|
'tags': ('labels', ..., 'val', {str}, all, {orderedSet}),
|
||||||
|
'age_limit': (
|
||||||
|
'labels', ..., 'val', {lambda x: 18 if x in ('sexual', 'porn', 'graphic-media') else None}, any),
|
||||||
|
'description': (*record_path, 'text', {str}, filter),
|
||||||
|
'title': (*record_path, 'text', {lambda x: x.replace('\n', '')}, {truncate_string(left=50)}),
|
||||||
|
}),
|
||||||
|
})
|
||||||
|
return entries
|
|
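The endpoint lookup and blob URL construction in the hunk above can be sketched without any network access. This is a minimal, standard-library-only illustration, not the extractor's code: the helper names `did_document_url`, `pick_service_endpoint` and `blob_url` are hypothetical, and the DID document is assumed to have already been fetched and decoded into a `dict`.

```python
from urllib.parse import urlencode

BLOB_URL_TMPL = '{}/xrpc/com.atproto.sync.getBlob'
FALLBACK_ENDPOINT = 'https://bsky.social'  # same fallback as in the diff


def did_document_url(did):
    """Return the URL a DID document would be fetched from (hypothetical helper)."""
    prefix = 'did:web:'
    if did.startswith(prefix):
        host = did[len(prefix):]
        return f'https://{host}/.well-known/did.json'
    # Anything else is looked up in the PLC directory, as in the diff
    return f'https://plc.directory/{did}'


def pick_service_endpoint(did_document):
    """Pick the first personal data server endpoint, else the fallback."""
    for service in (did_document or {}).get('service') or []:
        if not isinstance(service, dict):
            continue
        if service.get('type') != 'AtprotoPersonalDataServer':
            continue
        endpoint = service.get('serviceEndpoint')
        # Simplified stand-in for a URL sanity check
        if isinstance(endpoint, str) and endpoint.startswith(('http://', 'https://')):
            return endpoint
    return FALLBACK_ENDPOINT


def blob_url(endpoint, did, cid):
    """Build a getBlob URL with `did` and `cid` as query parameters."""
    return f'{BLOB_URL_TMPL.format(endpoint)}?{urlencode({"did": did, "cid": cid})}'


if __name__ == '__main__':
    document = {'service': [{
        'type': 'AtprotoPersonalDataServer',
        'serviceEndpoint': 'https://pds.example',  # made-up host for illustration
    }]}
    print(did_document_url('did:web:example.com'))
    print(blob_url(pick_service_endpoint(document), 'did:plc:abc', 'bafy'))
```

Keeping the fallback inside `pick_service_endpoint` mirrors the `or 'https://bsky.social'` in the diff: a failed or empty lookup still yields a usable endpoint.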
@ -4777,7 +4777,7 @@ class YoutubeIE(YoutubeBaseInfoExtractor):
|
||||||
'live_status': live_status,
|
'live_status': live_status,
|
||||||
'release_timestamp': live_start_time,
|
'release_timestamp': live_start_time,
|
||||||
'_format_sort_fields': ( # source_preference is lower for potentially damaged formats
|
'_format_sort_fields': ( # source_preference is lower for potentially damaged formats
|
||||||
'quality', 'res', 'fps', 'hdr:12', 'source', 'vcodec:vp9.2', 'channels', 'acodec', 'lang', 'proto'),
|
'quality', 'res', 'fps', 'hdr:12', 'source', 'vcodec', 'channels', 'acodec', 'lang', 'proto'),
|
||||||
}
|
}
|
||||||
|
|
||||||
subtitles = {}
|
subtitles = {}
|
||||||
|
@ -7858,7 +7858,7 @@ class YoutubeClipIE(YoutubeTabBaseInfoExtractor):
|
||||||
'section_start': int(clip_data['startTimeMs']) / 1000,
|
'section_start': int(clip_data['startTimeMs']) / 1000,
|
||||||
'section_end': int(clip_data['endTimeMs']) / 1000,
|
'section_end': int(clip_data['endTimeMs']) / 1000,
|
||||||
'_format_sort_fields': ( # https protocol is prioritized for ffmpeg compatibility
|
'_format_sort_fields': ( # https protocol is prioritized for ffmpeg compatibility
|
||||||
'proto:https', 'quality', 'res', 'fps', 'hdr:12', 'source', 'vcodec:vp9.2', 'channels', 'acodec', 'lang'),
|
'proto:https', 'quality', 'res', 'fps', 'hdr:12', 'source', 'vcodec', 'channels', 'acodec', 'lang'),
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
|
|
|
@ -483,13 +483,13 @@ def create_parser():
|
||||||
'no-attach-info-json', 'embed-thumbnail-atomicparsley', 'no-external-downloader-progress',
|
'no-attach-info-json', 'embed-thumbnail-atomicparsley', 'no-external-downloader-progress',
|
||||||
'embed-metadata', 'seperate-video-versions', 'no-clean-infojson', 'no-keep-subs', 'no-certifi',
|
'embed-metadata', 'seperate-video-versions', 'no-clean-infojson', 'no-keep-subs', 'no-certifi',
|
||||||
'no-youtube-channel-redirect', 'no-youtube-unavailable-videos', 'no-youtube-prefer-utc-upload-date',
|
'no-youtube-channel-redirect', 'no-youtube-unavailable-videos', 'no-youtube-prefer-utc-upload-date',
|
||||||
'prefer-legacy-http-handler', 'manifest-filesize-approx', 'allow-unsafe-ext',
|
'prefer-legacy-http-handler', 'manifest-filesize-approx', 'allow-unsafe-ext', 'prefer-vp9-sort',
|
||||||
}, 'aliases': {
|
}, 'aliases': {
|
||||||
'youtube-dl': ['all', '-multistreams', '-playlist-match-filter', '-manifest-filesize-approx', '-allow-unsafe-ext'],
|
'youtube-dl': ['all', '-multistreams', '-playlist-match-filter', '-manifest-filesize-approx', '-allow-unsafe-ext', '-prefer-vp9-sort'],
|
||||||
'youtube-dlc': ['all', '-no-youtube-channel-redirect', '-no-live-chat', '-playlist-match-filter', '-manifest-filesize-approx', '-allow-unsafe-ext'],
|
'youtube-dlc': ['all', '-no-youtube-channel-redirect', '-no-live-chat', '-playlist-match-filter', '-manifest-filesize-approx', '-allow-unsafe-ext', '-prefer-vp9-sort'],
|
||||||
'2021': ['2022', 'no-certifi', 'filename-sanitization'],
|
'2021': ['2022', 'no-certifi', 'filename-sanitization'],
|
||||||
'2022': ['2023', 'no-external-downloader-progress', 'playlist-match-filter', 'prefer-legacy-http-handler', 'manifest-filesize-approx'],
|
'2022': ['2023', 'no-external-downloader-progress', 'playlist-match-filter', 'prefer-legacy-http-handler', 'manifest-filesize-approx'],
|
||||||
'2023': [],
|
'2023': ['prefer-vp9-sort'],
|
||||||
},
|
},
|
||||||
}, help=(
|
}, help=(
|
||||||
'Options that can help keep compatibility with youtube-dl or youtube-dlc '
|
'Options that can help keep compatibility with youtube-dl or youtube-dlc '
|
||||||
|
|
|
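The alias table in the hunk above chains year aliases, so `2022` pulls in `2023`, which now carries `prefer-vp9-sort`. A simplified model of that expansion is sketched below. It is an assumption about the semantics (`all` selects every allowed value, a leading `-` removes one, aliases expand recursively) and not the parser's actual callback; `expand_compat_opts` is a hypothetical name and `ALLOWED` lists only a subset of the real options.

```python
# Subset of option names taken from the diff; the real list is longer
ALLOWED = {
    'multistreams', 'playlist-match-filter', 'manifest-filesize-approx',
    'allow-unsafe-ext', 'prefer-vp9-sort', 'no-certifi', 'filename-sanitization',
    'no-external-downloader-progress', 'prefer-legacy-http-handler',
}

ALIASES = {
    'youtube-dl': ['all', '-multistreams', '-playlist-match-filter',
                   '-manifest-filesize-approx', '-allow-unsafe-ext', '-prefer-vp9-sort'],
    '2021': ['2022', 'no-certifi', 'filename-sanitization'],
    '2022': ['2023', 'no-external-downloader-progress', 'playlist-match-filter',
             'prefer-legacy-http-handler', 'manifest-filesize-approx'],
    '2023': ['prefer-vp9-sort'],
}


def expand_compat_opts(values, allowed=ALLOWED, aliases=ALIASES):
    """Expand aliases into a flat set of option names (simplified model)."""
    result = set()
    for value in values:
        if value == 'all':
            result |= allowed
        elif value.startswith('-'):
            # A leading '-' removes whatever the rest of the name expands to
            result -= expand_compat_opts([value[1:]], allowed, aliases)
        elif value in aliases:
            result |= expand_compat_opts(aliases[value], allowed, aliases)
        elif value in allowed:
            result.add(value)
        else:
            raise ValueError(f'Unknown compat option: {value!r}')
    return result


if __name__ == '__main__':
    print(sorted(expand_compat_opts(['2022'])))
    print('prefer-vp9-sort' in expand_compat_opts(['youtube-dl']))
```

Under this model the `youtube-dl` alias excludes `prefer-vp9-sort`, while any of the year aliases enables it.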
@ -5278,6 +5278,7 @@ def make_archive_id(ie, video_id):
|
||||||
return f'{ie_key.lower()} {video_id}'
|
return f'{ie_key.lower()} {video_id}'
|
||||||
|
|
||||||
|
|
||||||
|
@partial_application
|
||||||
def truncate_string(s, left, right=0):
|
def truncate_string(s, left, right=0):
|
||||||
assert left > 3 and right >= 0
|
assert left > 3 and right >= 0
|
||||||
if s is None or len(s) <= left + right:
|
if s is None or len(s) <= left + right:
|
||||||
|
@ -5320,8 +5321,11 @@ class FormatSorter:
|
||||||
regex = r' *((?P<reverse>\+)?(?P<field>[a-zA-Z0-9_]+)((?P<separator>[~:])(?P<limit>.*?))?)? *$'
|
regex = r' *((?P<reverse>\+)?(?P<field>[a-zA-Z0-9_]+)((?P<separator>[~:])(?P<limit>.*?))?)? *$'
|
||||||
|
|
||||||
default = ('hidden', 'aud_or_vid', 'hasvid', 'ie_pref', 'lang', 'quality',
|
default = ('hidden', 'aud_or_vid', 'hasvid', 'ie_pref', 'lang', 'quality',
|
||||||
'res', 'fps', 'hdr:12', 'vcodec:vp9.2', 'channels', 'acodec',
|
'res', 'fps', 'hdr:12', 'vcodec', 'channels', 'acodec',
|
||||||
'size', 'br', 'asr', 'proto', 'ext', 'hasaud', 'source', 'id') # These must not be aliases
|
'size', 'br', 'asr', 'proto', 'ext', 'hasaud', 'source', 'id') # These must not be aliases
|
||||||
|
_prefer_vp9_sort = ('hidden', 'aud_or_vid', 'hasvid', 'ie_pref', 'lang', 'quality',
|
||||||
|
'res', 'fps', 'hdr:12', 'vcodec:vp9.2', 'channels', 'acodec',
|
||||||
|
'size', 'br', 'asr', 'proto', 'ext', 'hasaud', 'source', 'id')
|
||||||
ytdl_default = ('hasaud', 'lang', 'quality', 'tbr', 'filesize', 'vbr',
|
ytdl_default = ('hasaud', 'lang', 'quality', 'tbr', 'filesize', 'vbr',
|
||||||
'height', 'width', 'proto', 'vext', 'abr', 'aext',
|
'height', 'width', 'proto', 'vext', 'abr', 'aext',
|
||||||
'fps', 'fs_approx', 'source', 'id')
|
'fps', 'fs_approx', 'source', 'id')
|
||||||
|
|
|
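The only difference between `default` and `_prefer_vp9_sort` in the hunk above is `vcodec` versus `vcodec:vp9.2`. The sketch below applies the `regex` shown in the same hunk to individual sort items to make that difference visible. `parse_sort_field` is a hypothetical helper for illustration; how `FormatSorter` itself consumes the match is not shown in this diff.

```python
import re

# Pattern copied from the `regex` attribute in the hunk above
FIELD_RE = re.compile(
    r' *((?P<reverse>\+)?(?P<field>[a-zA-Z0-9_]+)((?P<separator>[~:])(?P<limit>.*?))?)? *$')


def parse_sort_field(item):
    """Split one sort item into its parts (hypothetical helper)."""
    match = FIELD_RE.match(item)
    if match is None or match.group('field') is None:
        raise ValueError(f'Invalid sort field: {item!r}')
    return {
        'field': match.group('field'),
        'reverse': match.group('reverse') is not None,
        'separator': match.group('separator'),
        'limit': match.group('limit'),
    }


if __name__ == '__main__':
    for item in ('vcodec', 'vcodec:vp9.2', '+res', 'filesize~1G'):
        print(item, parse_sort_field(item))
```

With plain `vcodec` no limit is captured, whereas `vcodec:vp9.2` carries `vp9.2` as the preferred value.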
@ -20,6 +20,7 @@ from ._utils import (
|
||||||
get_elements_html_by_class,
|
get_elements_html_by_class,
|
||||||
get_elements_html_by_attribute,
|
get_elements_html_by_attribute,
|
||||||
get_elements_by_attribute,
|
get_elements_by_attribute,
|
||||||
|
get_element_by_class,
|
||||||
get_element_html_by_attribute,
|
get_element_html_by_attribute,
|
||||||
get_element_by_attribute,
|
get_element_by_attribute,
|
||||||
get_element_html_by_id,
|
get_element_html_by_id,
|
||||||
|
@ -373,7 +374,7 @@ def subs_list_to_dict(subs: list[dict] | None = None, /, *, ext=None):
|
||||||
|
|
||||||
|
|
||||||
@typing.overload
|
@typing.overload
|
||||||
def find_element(*, attr: str, value: str, tag: str | None = None, html=False): ...
|
def find_element(*, attr: str, value: str, tag: str | None = None, html=False, regex=False): ...
|
||||||
|
|
||||||
|
|
||||||
@typing.overload
|
@typing.overload
|
||||||
|
@ -381,14 +382,14 @@ def find_element(*, cls: str, html=False): ...
|
||||||
|
|
||||||
|
|
||||||
@typing.overload
|
@typing.overload
|
||||||
def find_element(*, id: str, tag: str | None = None, html=False): ...
|
def find_element(*, id: str, tag: str | None = None, html=False, regex=False): ...
|
||||||
|
|
||||||
|
|
||||||
@typing.overload
|
@typing.overload
|
||||||
def find_element(*, tag: str, html=False): ...
|
def find_element(*, tag: str, html=False, regex=False): ...
|
||||||
|
|
||||||
|
|
||||||
def find_element(*, tag=None, id=None, cls=None, attr=None, value=None, html=False):
|
def find_element(*, tag=None, id=None, cls=None, attr=None, value=None, html=False, regex=False):
|
||||||
# deliberately using `id=` and `cls=` for ease of readability
|
# deliberately using `id=` and `cls=` for ease of readability
|
||||||
assert tag or id or cls or (attr and value), 'One of tag, id, cls or (attr AND value) is required'
|
assert tag or id or cls or (attr and value), 'One of tag, id, cls or (attr AND value) is required'
|
||||||
ANY_TAG = r'[\w:.-]+'
|
ANY_TAG = r'[\w:.-]+'
|
||||||
|
@ -397,17 +398,18 @@ def find_element(*, tag=None, id=None, cls=None, attr=None, value=None, html=Fal
|
||||||
assert not cls, 'Cannot match both attr and cls'
|
assert not cls, 'Cannot match both attr and cls'
|
||||||
assert not id, 'Cannot match both attr and id'
|
assert not id, 'Cannot match both attr and id'
|
||||||
func = get_element_html_by_attribute if html else get_element_by_attribute
|
func = get_element_html_by_attribute if html else get_element_by_attribute
|
||||||
return functools.partial(func, attr, value, tag=tag or ANY_TAG)
|
return functools.partial(func, attr, value, tag=tag or ANY_TAG, escape_value=not regex)
|
||||||
|
|
||||||
elif cls:
|
elif cls:
|
||||||
assert not id, 'Cannot match both cls and id'
|
assert not id, 'Cannot match both cls and id'
|
||||||
assert tag is None, 'Cannot match both cls and tag'
|
assert tag is None, 'Cannot match both cls and tag'
|
||||||
func = get_element_html_by_class if html else get_elements_by_class
|
assert not regex, 'Cannot use regex with cls'
|
||||||
|
func = get_element_html_by_class if html else get_element_by_class
|
||||||
return functools.partial(func, cls)
|
return functools.partial(func, cls)
|
||||||
|
|
||||||
elif id:
|
elif id:
|
||||||
func = get_element_html_by_id if html else get_element_by_id
|
func = get_element_html_by_id if html else get_element_by_id
|
||||||
return functools.partial(func, id, tag=tag or ANY_TAG)
|
return functools.partial(func, id, tag=tag or ANY_TAG, escape_value=not regex)
|
||||||
|
|
||||||
index = int(bool(html))
|
index = int(bool(html))
|
||||||
return lambda html: get_element_text_and_html_by_tag(tag, html)[index]
|
return lambda html: get_element_text_and_html_by_tag(tag, html)[index]
|
||||||
|
@ -418,19 +420,20 @@ def find_elements(*, cls: str, html=False): ...
|
||||||
|
|
||||||
|
|
||||||
@typing.overload
|
@typing.overload
|
||||||
def find_elements(*, attr: str, value: str, tag: str | None = None, html=False): ...
|
def find_elements(*, attr: str, value: str, tag: str | None = None, html=False, regex=False): ...
|
||||||
|
|
||||||
|
|
||||||
def find_elements(*, tag=None, cls=None, attr=None, value=None, html=False):
|
def find_elements(*, tag=None, cls=None, attr=None, value=None, html=False, regex=False):
|
||||||
# deliberately using `cls=` for ease of readability
|
# deliberately using `cls=` for ease of readability
|
||||||
assert cls or (attr and value), 'One of cls or (attr AND value) is required'
|
assert cls or (attr and value), 'One of cls or (attr AND value) is required'
|
||||||
|
|
||||||
if attr and value:
|
if attr and value:
|
||||||
assert not cls, 'Cannot match both attr and cls'
|
assert not cls, 'Cannot match both attr and cls'
|
||||||
func = get_elements_html_by_attribute if html else get_elements_by_attribute
|
func = get_elements_html_by_attribute if html else get_elements_by_attribute
|
||||||
return functools.partial(func, attr, value, tag=tag or r'[\w:.-]+')
|
return functools.partial(func, attr, value, tag=tag or r'[\w:.-]+', escape_value=not regex)
|
||||||
|
|
||||||
assert not tag, 'Cannot match both cls and tag'
|
assert not tag, 'Cannot match both cls and tag'
|
||||||
|
assert not regex, 'Cannot use regex with cls'
|
||||||
func = get_elements_html_by_class if html else get_elements_by_class
|
func = get_elements_html_by_class if html else get_elements_by_class
|
||||||
return functools.partial(func, cls)
|
return functools.partial(func, cls)
|
||||||
|
|
||||||
|
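The new `regex` flag in the hunks above is forwarded as `escape_value=not regex`. The effect can be illustrated with a deliberately simplified, standard-library-only matcher. `find_attr_value` is a hypothetical helper, it only handles double-quoted attributes, and it is not how the real element helpers parse HTML.

```python
import re


def find_attr_value(attr, value, html, *, regex=False):
    """Return the first matching value of `attr`, or None (simplified sketch)."""
    # With regex=False the value is escaped, so it can only match literally
    value_pattern = value if regex else re.escape(value)
    match = re.search(rf'\b{re.escape(attr)}="(?P<value>{value_pattern})"', html)
    return match.group('value') if match else None


if __name__ == '__main__':
    page = '<div id="video-123"></div><div id="video-.*"></div>'
    print(find_attr_value('id', 'video-.*', page))               # literal match
    print(find_attr_value('id', r'video-\d+', page, regex=True))  # pattern match
    print(find_attr_value('id', r'video-\d+', page))              # literal, no match
```

Defaulting to escaped values keeps existing callers unaffected by characters such as `.` or `+` in attribute values, which is presumably why `regex` is opt-in.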
@ -449,6 +452,14 @@ def trim_str(*, start=None, end=None):
|
||||||
return trim
|
return trim
|
||||||
|
|
||||||
|
|
||||||
|
def unpack(func):
|
||||||
|
@functools.wraps(func)
|
||||||
|
def inner(items, **kwargs):
|
||||||
|
return func(*items, **kwargs)
|
||||||
|
|
||||||
|
return inner
|
||||||
|
|
||||||
|
|
||||||
def get_first(obj, *paths, **kwargs):
|
def get_first(obj, *paths, **kwargs):
|
||||||
return traverse_obj(obj, *((..., *variadic(keys)) for keys in paths), **kwargs, get_all=False)
|
return traverse_obj(obj, *((..., *variadic(keys)) for keys in paths), **kwargs, get_all=False)
|
||||||
|
|
||||||
|
|
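A short usage sketch for the `unpack` helper added in the hunk above. The decorator body is copied from the diff; `join_name` and the sample data are made up for illustration.

```python
import functools


def unpack(func):
    # Same body as in the diff: spread a single iterable argument into positionals
    @functools.wraps(func)
    def inner(items, **kwargs):
        return func(*items, **kwargs)

    return inner


def join_name(first, last, sep=' '):
    """Hypothetical two-argument function used only for this example."""
    return f'{first}{sep}{last}'


if __name__ == '__main__':
    pairs = [('Ada', 'Lovelace'), ('Alan', 'Turing')]
    # Each tuple is passed as one argument and unpacked into (first, last)
    print(list(map(unpack(join_name), pairs)))
    # Keyword arguments are forwarded unchanged
    print(unpack(join_name)(('Ada', 'Lovelace'), sep='_'))
```

This is useful wherever a callback receives one sequence per call but the wrapped function expects separate positional arguments.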