Merge dda6f7b563 into f919729538

Release 2024.11.18
Created by: bashonly :ci skip all
2024-11-23 15:51:24 +01:00 · 2024-11-18 15:31:39 +05:30 · 2024-11-18 05:45:05 +00:00 · 2024-11-18 05:36:38 +00:00 · 2024-11-18 05:16:17 +00:00 · 2024-10-15 16:35:28 +01:00
8 changed files with 356 additions and 122 deletions
--- a/12
+++ b/12
@ -695,3 +695,15 @@ KBelmin
 kesor
 MellowKyler
 Wesley107772
+a13ssandr0
+ChocoLZS
+doe1080
+hugovdev
+jshumphrey
+julionc
+manavchaudhary1
+powergold1
+Sakura286
+SamDecrock
+stratus-ss
+subrat-lima
--- a/Changelog.md
+++ b/Changelog.md
@ -4,6 +4,64 @@
 # To create a release, dispatch the https://github.com/yt-dlp/yt-dlp/actions/workflows/release.yml workflow on master
 -->

+### 2024.11.18
+
+#### Important changes
+- **Login with OAuth is no longer supported for YouTube**
+Due to a change made by the site, yt-dlp is no longer able to support OAuth login for YouTube. [Read more](https://github.com/yt-dlp/yt-dlp/issues/11462#issuecomment-2471703090)
+
+#### Core changes
+- [Catch broken Cryptodome installations](https://github.com/yt-dlp/yt-dlp/commit/b83ca24eb72e1e558b0185bd73975586c0bc0546) ([#11486](https://github.com/yt-dlp/yt-dlp/issues/11486)) by [seproDev](https://github.com/seproDev)
+- **utils**
+    - [Fix `join_nonempty`, add `**kwargs` to `unpack`](https://github.com/yt-dlp/yt-dlp/commit/39d79c9b9cf23411d935910685c40aa1a2fdb409) ([#11559](https://github.com/yt-dlp/yt-dlp/issues/11559)) by [Grub4K](https://github.com/Grub4K)
+    - `subs_list_to_dict`: [Add `lang` default parameter](https://github.com/yt-dlp/yt-dlp/commit/c014fbcddcb4c8f79d914ac5bb526758b540ea33) ([#11508](https://github.com/yt-dlp/yt-dlp/issues/11508)) by [Grub4K](https://github.com/Grub4K)
+
+#### Extractor changes
+- [Allow `ext` override for thumbnails](https://github.com/yt-dlp/yt-dlp/commit/eb64ae7d5def6df2aba74fb703e7f168fb299865) ([#11545](https://github.com/yt-dlp/yt-dlp/issues/11545)) by [bashonly](https://github.com/bashonly)
+- **adobepass**: [Fix provider requests](https://github.com/yt-dlp/yt-dlp/commit/85fdc66b6e01d19a94b4f39b58e3c0cf23600902) ([#11472](https://github.com/yt-dlp/yt-dlp/issues/11472)) by [bashonly](https://github.com/bashonly)
+- **archive.org**: [Fix comments extraction](https://github.com/yt-dlp/yt-dlp/commit/f2a4983df7a64c4e93b56f79dbd16a781bd90206) ([#11527](https://github.com/yt-dlp/yt-dlp/issues/11527)) by [jshumphrey](https://github.com/jshumphrey)
+- **bandlab**: [Add extractors](https://github.com/yt-dlp/yt-dlp/commit/6365e92589e4bc17b8fffb0125a716d144ad2137) ([#11535](https://github.com/yt-dlp/yt-dlp/issues/11535)) by [seproDev](https://github.com/seproDev)
+- **chaturbate**
+    - [Extract from API and support impersonation](https://github.com/yt-dlp/yt-dlp/commit/720b3dc453c342bc2e8df7dbc0acaab4479de46c) ([#11555](https://github.com/yt-dlp/yt-dlp/issues/11555)) by [powergold1](https://github.com/powergold1) (With fixes in [7cecd29](https://github.com/yt-dlp/yt-dlp/commit/7cecd299e4a5ef1f0f044b2fedc26f17e41f15e3) by [seproDev](https://github.com/seproDev))
+    - [Support alternate domains](https://github.com/yt-dlp/yt-dlp/commit/a9f85670d03ab993dc589f21a9ffffcad61392d5) ([#10595](https://github.com/yt-dlp/yt-dlp/issues/10595)) by [manavchaudhary1](https://github.com/manavchaudhary1)
+- **cloudflarestream**: [Avoid extraction via videodelivery.net](https://github.com/yt-dlp/yt-dlp/commit/2db8c2e7d57a1784b06057c48e3e91023720d195) ([#11478](https://github.com/yt-dlp/yt-dlp/issues/11478)) by [hugovdev](https://github.com/hugovdev)
+- **ctvnews**
+    - [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/f351440f1dc5b3dfbfc5737b037a869d946056fe) ([#11534](https://github.com/yt-dlp/yt-dlp/issues/11534)) by [bashonly](https://github.com/bashonly), [jshumphrey](https://github.com/jshumphrey)
+    - [Fix playlist ID extraction](https://github.com/yt-dlp/yt-dlp/commit/f9d98509a898737c12977b2e2117277bada2c196) ([#8892](https://github.com/yt-dlp/yt-dlp/issues/8892)) by [qbnu](https://github.com/qbnu)
+- **digitalconcerthall**: [Support login with access/refresh tokens](https://github.com/yt-dlp/yt-dlp/commit/f7257588bdff5f0b0452635a66b253a783c97357) ([#11571](https://github.com/yt-dlp/yt-dlp/issues/11571)) by [bashonly](https://github.com/bashonly)
+- **facebook**: [Fix formats extraction](https://github.com/yt-dlp/yt-dlp/commit/bacc31b05a04181b63100c481565256b14813a5e) ([#11513](https://github.com/yt-dlp/yt-dlp/issues/11513)) by [bashonly](https://github.com/bashonly)
+- **gamedevtv**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/be3579aaf0c3b71a0a3195e1955415d5e4d6b3d8) ([#11368](https://github.com/yt-dlp/yt-dlp/issues/11368)) by [bashonly](https://github.com/bashonly), [stratus-ss](https://github.com/stratus-ss)
+- **goplay**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/6b43a8d84b881d769b480ba6e20ec691e9d1b92d) ([#11466](https://github.com/yt-dlp/yt-dlp/issues/11466)) by [bashonly](https://github.com/bashonly), [SamDecrock](https://github.com/SamDecrock)
+- **kenh14**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/eb15fd5a32d8b35ef515f7a3d1158c03025648ff) ([#3996](https://github.com/yt-dlp/yt-dlp/issues/3996)) by [krichbanana](https://github.com/krichbanana), [pzhlkj6612](https://github.com/pzhlkj6612)
+- **litv**: [Fix extractor](https://github.com/yt-dlp/yt-dlp/commit/e079ffbda66de150c0a9ebef05e89f61bb4d5f76) ([#11071](https://github.com/yt-dlp/yt-dlp/issues/11071)) by [jiru](https://github.com/jiru)
+- **mixchmovie**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/0ec9bfed4d4a52bfb4f8733da1acf0aeeae21e6b) ([#10897](https://github.com/yt-dlp/yt-dlp/issues/10897)) by [Sakura286](https://github.com/Sakura286)
+- **patreon**: [Fix comments extraction](https://github.com/yt-dlp/yt-dlp/commit/1d253b0a27110d174c40faf8fb1c999d099e0cde) ([#11530](https://github.com/yt-dlp/yt-dlp/issues/11530)) by [bashonly](https://github.com/bashonly), [jshumphrey](https://github.com/jshumphrey)
+- **pialive**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/d867f99622ef7fba690b08da56c39d739b822bb7) ([#10811](https://github.com/yt-dlp/yt-dlp/issues/10811)) by [ChocoLZS](https://github.com/ChocoLZS)
+- **radioradicale**: [Add extractor](https://github.com/yt-dlp/yt-dlp/commit/70c55cb08f780eab687e881ef42bb5c6007d290b) ([#5607](https://github.com/yt-dlp/yt-dlp/issues/5607)) by [a13ssandr0](https://github.com/a13ssandr0), [pzhlkj6612](https://github.com/pzhlkj6612)
+- **reddit**: [Improve error handling](https://github.com/yt-dlp/yt-dlp/commit/7ea2787920cccc6b8ea30791993d114fbd564434) ([#11573](https://github.com/yt-dlp/yt-dlp/issues/11573)) by [bashonly](https://github.com/bashonly)
+- **redgifsuser**: [Fix extraction](https://github.com/yt-dlp/yt-dlp/commit/d215fba7edb69d4fa665f43663756fd260b1489f) ([#11531](https://github.com/yt-dlp/yt-dlp/issues/11531)) by [jshumphrey](https://github.com/jshumphrey)
+- **rutube**: [Rework extractors](https://github.com/yt-dlp/yt-dlp/commit/e398217aae19bb25f91797bfbe8a3243698d7f45) ([#11480](https://github.com/yt-dlp/yt-dlp/issues/11480)) by [seproDev](https://github.com/seproDev)
+- **sonylivseries**: [Add `sort_order` extractor-arg](https://github.com/yt-dlp/yt-dlp/commit/2009cb27e17014787bf63eaa2ada51293d54f22a) ([#11569](https://github.com/yt-dlp/yt-dlp/issues/11569)) by [bashonly](https://github.com/bashonly)
+- **soop**: [Fix thumbnail extraction](https://github.com/yt-dlp/yt-dlp/commit/c699bafc5038b59c9afe8c2e69175fb66424c832) ([#11545](https://github.com/yt-dlp/yt-dlp/issues/11545)) by [bashonly](https://github.com/bashonly)
+- **spankbang**: [Support browser impersonation](https://github.com/yt-dlp/yt-dlp/commit/8388ec256f7753b02488788e3cfa771f6e1db247) ([#11542](https://github.com/yt-dlp/yt-dlp/issues/11542)) by [jshumphrey](https://github.com/jshumphrey)
+- **spreaker**
+    - [Support episode pages and access keys](https://github.com/yt-dlp/yt-dlp/commit/c39016f66df76d14284c705736ca73db8055d8de) ([#11489](https://github.com/yt-dlp/yt-dlp/issues/11489)) by [julionc](https://github.com/julionc)
+    - [Support podcast and feed pages](https://github.com/yt-dlp/yt-dlp/commit/c6737310619022248f5d0fd13872073cac168453) ([#10968](https://github.com/yt-dlp/yt-dlp/issues/10968)) by [subrat-lima](https://github.com/subrat-lima)
+- **youtube**
+    - [Player client maintenance](https://github.com/yt-dlp/yt-dlp/commit/637d62a3a9fc723d68632c1af25c30acdadeeb85) ([#11528](https://github.com/yt-dlp/yt-dlp/issues/11528)) by [bashonly](https://github.com/bashonly), [seproDev](https://github.com/seproDev)
+    - [Remove broken OAuth support](https://github.com/yt-dlp/yt-dlp/commit/52c0ffe40ad6e8404d93296f575007b05b04c686) ([#11558](https://github.com/yt-dlp/yt-dlp/issues/11558)) by [bashonly](https://github.com/bashonly)
+    - tab: [Fix podcasts tab extraction](https://github.com/yt-dlp/yt-dlp/commit/37cd7660eaff397c551ee18d80507702342b0c2b) ([#11567](https://github.com/yt-dlp/yt-dlp/issues/11567)) by [seproDev](https://github.com/seproDev)
+
+#### Misc. changes
+- **build**
+    - [Bump PyInstaller version pin to `>=6.11.1`](https://github.com/yt-dlp/yt-dlp/commit/f9c8deb4e5887ff5150e911ac0452e645f988044) ([#11507](https://github.com/yt-dlp/yt-dlp/issues/11507)) by [bashonly](https://github.com/bashonly)
+    - [Enable attestations for trusted publishing](https://github.com/yt-dlp/yt-dlp/commit/f13df591d4d7ca8e2f31b35c9c91e69ba9e9b013) ([#11420](https://github.com/yt-dlp/yt-dlp/issues/11420)) by [bashonly](https://github.com/bashonly)
+    - [Pin `websockets` version to >=13.0,<14](https://github.com/yt-dlp/yt-dlp/commit/240a7d43c8a67ffb86d44dc276805aa43c358dcc) ([#11488](https://github.com/yt-dlp/yt-dlp/issues/11488)) by [bashonly](https://github.com/bashonly)
+- **cleanup**
+    - [Deprecate more compat functions](https://github.com/yt-dlp/yt-dlp/commit/f95a92b3d0169a784ee15a138fbe09d82b2754a1) ([#11439](https://github.com/yt-dlp/yt-dlp/issues/11439)) by [seproDev](https://github.com/seproDev)
+    - [Remove dead extractors](https://github.com/yt-dlp/yt-dlp/commit/10fc719bc7f1eef469389c5219102266ef411f29) ([#11566](https://github.com/yt-dlp/yt-dlp/issues/11566)) by [doe1080](https://github.com/doe1080)
+    - Miscellaneous: [da252d9](https://github.com/yt-dlp/yt-dlp/commit/da252d9d322af3e2178ac5eae324809502a0a862) by [bashonly](https://github.com/bashonly), [Grub4K](https://github.com/Grub4K), [seproDev](https://github.com/seproDev)
+
 ### 2024.11.04

 #### Important changes
--- a/README.md
+++ b/README.md
@ -1867,9 +1867,6 @@ The following extractors use this feature:
 #### bilibili
 * `prefer_multi_flv`: Prefer extracting flv formats over mp4 for older videos that still provide legacy formats

-#### digitalconcerthall
-* `prefer_combined_hls`: Prefer extracting combined/pre-merged video and audio HLS formats. This will exclude 4K/HEVC video and lossless/FLAC audio formats, which are only available as split video/audio HLS formats
-
 #### sonylivseries
 * `sort_order`: Episode sort order for series extraction - one of `asc` (ascending, oldest first) or `desc` (descending, newest first). Default is `asc`

--- a/supportedsites.md
+++ b/supportedsites.md
@ -129,6 +129,8 @@
 - **Bandcamp:album**
 - **Bandcamp:user**
 - **Bandcamp:weekly**
+ - **Bandlab**
+ - **BandlabPlaylist**
 - **BannedVideo**
 - **bbc**: [*bbc*](## "netrc machine") BBC
 - **bbc.co.uk**: [*bbc*](## "netrc machine") BBC iPlayer
@ -484,6 +486,7 @@
 - **Gab**
 - **GabTV**
 - **Gaia**: [*gaia*](## "netrc machine")
+ - **GameDevTVDashboard**: [*gamedevtv*](## "netrc machine")
 - **GameJolt**
 - **GameJoltCommunity**
 - **GameJoltGame**
@ -651,6 +654,8 @@
 - **Karaoketv**
 - **Katsomo**: (**Currently broken**)
 - **KelbyOne**: (**Currently broken**)
+ - **Kenh14Playlist**
+ - **Kenh14Video**
 - **Ketnet**
 - **khanacademy**
 - **khanacademy:unit**
@ -784,10 +789,6 @@
 - **MicrosoftLearnSession**
 - **MicrosoftMedius**
 - **microsoftstream**: Microsoft Stream
- - **mildom**: Record ongoing live by specific user in Mildom
- - **mildom:clip**: Clip in Mildom
- - **mildom:user:vod**: Download all VODs from specific user in Mildom
- - **mildom:vod**: VOD in Mildom
 - **minds**
 - **minds:channel**
 - **minds:group**
@ -798,6 +799,7 @@
 - **MiTele**: mitele.es
 - **mixch**
 - **mixch:archive**
+ - **mixch:movie**
 - **mixcloud**
 - **mixcloud:playlist**
 - **mixcloud:user**
@ -1060,8 +1062,8 @@
 - **PhilharmonieDeParis**: Philharmonie de Paris
 - **phoenix.de**
 - **Photobucket**
+ - **PiaLive**
 - **Piapro**: [*piapro*](## "netrc machine")
- - **PIAULIZAPortal**: ulizaportal.jp - PIA LIVE STREAM
 - **Picarto**
 - **PicartoVod**
 - **Piksel**
@ -1088,8 +1090,6 @@
 - **PodbayFMChannel**
 - **Podchaser**
 - **podomatic**: (**Currently broken**)
- - **Pokemon**
- - **PokemonWatch**
 - **PokerGo**: [*pokergo*](## "netrc machine")
 - **PokerGoCollection**: [*pokergo*](## "netrc machine")
 - **PolsatGo**
@ -1160,6 +1160,7 @@
 - **RadioJavan**: (**Currently broken**)
 - **radiokapital**
 - **radiokapital:show**
+ - **RadioRadicale**
 - **RadioZetPodcast**
 - **radlive**
 - **radlive:channel**
@ -1367,9 +1368,7 @@
 - **spotify**: Spotify episodes (**Currently broken**)
 - **spotify:show**: Spotify shows (**Currently broken**)
 - **Spreaker**
- - **SpreakerPage**
 - **SpreakerShow**
- - **SpreakerShowPage**
 - **SpringboardPlatform**
 - **Sprout**
 - **SproutVideo**
@ -1570,6 +1569,8 @@
 - **UFCTV**: [*ufctv*](## "netrc machine")
 - **ukcolumn**: (**Currently broken**)
 - **UKTVPlay**
+ - **UlizaPlayer**
+ - **UlizaPortal**: ulizaportal.jp
 - **umg:de**: Universal Music Deutschland (**Currently broken**)
 - **Unistra**
 - **Unity**: (**Currently broken**)
@ -1587,8 +1588,6 @@
 - **Varzesh3**: (**Currently broken**)
 - **Vbox7**
 - **Veo**
- - **Veoh**
- - **veoh:user**
 - **Vesti**: Вести.Ru (**Currently broken**)
 - **Vevo**
 - **VevoPlaylist**
--- a/yt_dlp/extractor/digitalconcerthall.py
+++ b/yt_dlp/extractor/digitalconcerthall.py
@ -1,7 +1,10 @@
+import time
+
 from .common import InfoExtractor
 from ..networking.exceptions import HTTPError
 from ..utils import (
    ExtractorError,
+    jwt_decode_hs256,
    parse_codecs,
    try_get,
    url_or_none,
@ -13,9 +16,6 @@ from ..utils.traversal import traverse_obj
 class DigitalConcertHallIE(InfoExtractor):
    IE_DESC = 'DigitalConcertHall extractor'
    _VALID_URL = r'https?://(?:www\.)?digitalconcerthall\.com/(?P<language>[a-z]+)/(?P<type>film|concert|work)/(?P<id>[0-9]+)-?(?P<part>[0-9]+)?'
-    _OAUTH_URL = 'https://api.digitalconcerthall.com/v2/oauth2/token'
-    _USER_AGENT = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.5 Safari/605.1.15'
-    _ACCESS_TOKEN = None
    _NETRC_MACHINE = 'digitalconcerthall'
    _TESTS = [{
        'note': 'Playlist with only one video',
@ -69,59 +69,157 @@ class DigitalConcertHallIE(InfoExtractor):
        'params': {'skip_download': 'm3u8'},
        'playlist_count': 1,
    }]
+    _LOGIN_HINT = ('Use  --username token --password ACCESS_TOKEN  where ACCESS_TOKEN '
+                   'is the "access_token_production" from your browser local storage')
+    _REFRESH_HINT = 'or else use a "refresh_token" with  --username refresh --password REFRESH_TOKEN'
+    _OAUTH_URL = 'https://api.digitalconcerthall.com/v2/oauth2/token'
+    _CLIENT_ID = 'dch.webapp'
+    _CLIENT_SECRET = '2ySLN+2Fwb'
+    _USER_AGENT = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.5 Safari/605.1.15'
+    _OAUTH_HEADERS = {
+        'Accept': 'application/json',
+        'Content-Type': 'application/x-www-form-urlencoded;charset=UTF-8',
+        'Origin': 'https://www.digitalconcerthall.com',
+        'Referer': 'https://www.digitalconcerthall.com/',
+        'User-Agent': _USER_AGENT,
+    }
+    _access_token = None
+    _access_token_expiry = 0
+    _refresh_token = None

-    def _perform_login(self, username, password):
-        login_token = self._download_json(
-            self._OAUTH_URL,
-            None, 'Obtaining token', errnote='Unable to obtain token', data=urlencode_postdata({
+    @property
+    def _access_token_is_expired(self):
+        return self._access_token_expiry - 30 <= int(time.time())
+
+    def _set_access_token(self, value):
+        self._access_token = value
+        self._access_token_expiry = traverse_obj(value, ({jwt_decode_hs256}, 'exp', {int})) or 0
+
+    def _cache_tokens(self, /):
+        self.cache.store(self._NETRC_MACHINE, 'tokens', {
+            'access_token': self._access_token,
+            'refresh_token': self._refresh_token,
+        })
+
+    def _fetch_new_tokens(self, invalidate=False):
+        if invalidate:
+            self.report_warning('Access token has been invalidated')
+            self._set_access_token(None)
+
+        if not self._access_token_is_expired:
+            return
+
+        if not self._refresh_token:
+            self._set_access_token(None)
+            self._cache_tokens()
+            raise ExtractorError(
+                'Access token has expired or been invalidated. '
+                'Get a new "access_token_production" value from your browser '
+                f'and try again, {self._REFRESH_HINT}', expected=True)
+
+        # If we only have a refresh token, we need a temporary "initial token" for the refresh flow
+        bearer_token = self._access_token or self._download_json(
+            self._OAUTH_URL, None, 'Obtaining initial token', 'Unable to obtain initial token',
+            data=urlencode_postdata({
                'affiliate': 'none',
                'grant_type': 'device',
                'device_vendor': 'unknown',
-                # device_model 'Safari' gets split streams of 4K/HEVC video and lossless/FLAC audio
-                'device_model': 'unknown' if self._configuration_arg('prefer_combined_hls') else 'Safari',
-                'app_id': 'dch.webapp',
+                # device_model 'Safari' gets split streams of 4K/HEVC video and lossless/FLAC audio,
+                # but this is no longer effective since actual login is not possible anymore
+                'device_model': 'unknown',
+                'app_id': self._CLIENT_ID,
                'app_distributor': 'berlinphil',
-                'app_version': '1.84.0',
-                'client_secret': '2ySLN+2Fwb',
-            }), headers={
-                'Accept': 'application/json',
-                'Content-Type': 'application/x-www-form-urlencoded;charset=UTF-8',
-                'User-Agent': self._USER_AGENT,
-            })['access_token']
+                'app_version': '1.95.0',
+                'client_secret': self._CLIENT_SECRET,
+            }), headers=self._OAUTH_HEADERS)['access_token']
+
        try:
-            login_response = self._download_json(
-                self._OAUTH_URL,
-                None, note='Logging in', errnote='Unable to login', data=urlencode_postdata({
-                    'grant_type': 'password',
-                    'username': username,
-                    'password': password,
+            response = self._download_json(
+                self._OAUTH_URL, None, 'Refreshing token', 'Unable to refresh token',
+                data=urlencode_postdata({
+                    'grant_type': 'refresh_token',
+                    'refresh_token': self._refresh_token,
+                    'client_id': self._CLIENT_ID,
+                    'client_secret': self._CLIENT_SECRET,
                }), headers={
-                    'Accept': 'application/json',
-                    'Content-Type': 'application/x-www-form-urlencoded;charset=UTF-8',
-                    'Referer': 'https://www.digitalconcerthall.com',
-                    'Authorization': f'Bearer {login_token}',
-                    'User-Agent': self._USER_AGENT,
+                    **self._OAUTH_HEADERS,
+                    'Authorization': f'Bearer {bearer_token}',
                })
-        except ExtractorError as error:
-            if isinstance(error.cause, HTTPError) and error.cause.status == 401:
-                raise ExtractorError('Invalid username or password', expected=True)
+        except ExtractorError as e:
+            if isinstance(e.cause, HTTPError) and e.cause.status == 401:
+                self._set_access_token(None)
+                self._refresh_token = None
+                self._cache_tokens()
+                raise ExtractorError('Your tokens have been invalidated', expected=True)
            raise
-        self._ACCESS_TOKEN = login_response['access_token']
+
+        self._set_access_token(response['access_token'])
+        if refresh_token := traverse_obj(response, ('refresh_token', {str})):
+            self.write_debug('New refresh token granted')
+            self._refresh_token = refresh_token
+        self._cache_tokens()
+
+    def _perform_login(self, username, password):
+        self.report_login()
+
+        if username == 'refresh':
+            self._refresh_token = password
+            self._fetch_new_tokens()
+
+        if username == 'token':
+            if not traverse_obj(password, {jwt_decode_hs256}):
+                raise ExtractorError(
+                    f'The access token passed to yt-dlp is not valid. {self._LOGIN_HINT}', expected=True)
+            self._set_access_token(password)
+            self._cache_tokens()
+
+        if username in ('refresh', 'token'):
+            if self.get_param('cachedir') is not False:
+                token_type = 'access' if username == 'token' else 'refresh'
+                self.to_screen(f'Your {token_type} token has been cached to disk. To use the cached '
+                               'token next time, pass  --username cache  along with any password')
+            return
+
+        if username != 'cache':
+            raise ExtractorError(
+                'Login with username and password is no longer supported '
+                f'for this site. {self._LOGIN_HINT}, {self._REFRESH_HINT}', expected=True)
+
+        # Try cached access_token
+        cached_tokens = self.cache.load(self._NETRC_MACHINE, 'tokens', default={})
+        self._set_access_token(cached_tokens.get('access_token'))
+        self._refresh_token = cached_tokens.get('refresh_token')
+        if not self._access_token_is_expired:
+            return
+
+        # Try cached refresh_token
+        self._fetch_new_tokens(invalidate=True)

    def _real_initialize(self):
-        if not self._ACCESS_TOKEN:
-            self.raise_login_required(method='password')
+        if not self._access_token:
+            self.raise_login_required(
+                'All content on this site is only available for registered users. '
+                f'{self._LOGIN_HINT}, {self._REFRESH_HINT}', method=None)

    def _entries(self, items, language, type_, **kwargs):
        for item in items:
            video_id = item['id']
-            stream_info = self._download_json(
-                self._proto_relative_url(item['_links']['streams']['href']), video_id, headers={
-                    'Accept': 'application/json',
-                    'Authorization': f'Bearer {self._ACCESS_TOKEN}',
-                    'Accept-Language': language,
-                    'User-Agent': self._USER_AGENT,
-                })
+
+            for should_retry in (True, False):
+                self._fetch_new_tokens(invalidate=not should_retry)
+                try:
+                    stream_info = self._download_json(
+                        self._proto_relative_url(item['_links']['streams']['href']), video_id, headers={
+                            'Accept': 'application/json',
+                            'Authorization': f'Bearer {self._access_token}',
+                            'Accept-Language': language,
+                            'User-Agent': self._USER_AGENT,
+                        })
+                    break
+                except ExtractorError as error:
+                    if should_retry and isinstance(error.cause, HTTPError) and error.cause.status == 401:
+                        continue
+                    raise

            formats = []
            for m3u8_url in traverse_obj(stream_info, ('channel', ..., 'stream', ..., 'url', {url_or_none})):
@ -157,7 +255,6 @@ class DigitalConcertHallIE(InfoExtractor):
                'Accept': 'application/json',
                'Accept-Language': language,
                'User-Agent': self._USER_AGENT,
-                'Authorization': f'Bearer {self._ACCESS_TOKEN}',
            })
        videos = [vid_info] if type_ == 'film' else traverse_obj(vid_info, ('_embedded', ..., ...))

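Note on the token handling in the hunks above: the extractor reads the `exp` claim from the access token and treats the token as expired 30 seconds before that time. The standalone sketch below illustrates this check using only the standard library. The helper names (`decode_jwt_payload`, `token_expiry`, `is_expired`) are hypothetical and not part of yt-dlp, which uses `jwt_decode_hs256` and `traverse_obj` for this. As in the diff, the signature is not verified.

```python
import base64
import json
import time


def decode_jwt_payload(token):
    """Decode the payload segment of a JWT without verifying its signature."""
    payload_b64 = token.split('.')[1]
    # base64url segments may arrive without padding; restore it before decoding
    payload_b64 += '=' * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def token_expiry(token):
    """Return the token's `exp` claim as an int, or 0 if it cannot be read."""
    try:
        return int(decode_jwt_payload(token)['exp'])
    except (AttributeError, IndexError, KeyError, TypeError, ValueError):
        # Missing token, malformed segments, or absent/non-numeric `exp`
        return 0


def is_expired(expiry, leeway=30, now=None):
    """Treat a token as expired `leeway` seconds before its actual expiry."""
    now = int(time.time()) if now is None else now
    return expiry - leeway <= now
```

An unreadable or missing token yields an expiry of 0, so it always counts as expired. This mirrors the `or 0` fallback in `_set_access_token`.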
--- a/yt_dlp/extractor/radiofrance.py
+++ b/yt_dlp/extractor/radiofrance.py
@ -1,6 +1,4 @@
-import itertools
 import re
-import urllib.parse

 from .common import InfoExtractor
 from ..utils import (
@ -19,18 +17,6 @@ class RadioFranceIE(InfoExtractor):
    _VALID_URL = r'https?://maison\.radiofrance\.fr/radiovisions/(?P<id>[^?#]+)'
    IE_NAME = 'radiofrance'

-    _TEST = {
-        'url': 'http://maison.radiofrance.fr/radiovisions/one-one',
-        'md5': 'bdbb28ace95ed0e04faab32ba3160daf',
-        'info_dict': {
-            'id': 'one-one',
-            'ext': 'ogg',
-            'title': 'One to one',
-            'description': "Plutôt que d'imaginer la radio de demain comme technologie ou comme création de contenu, je veux montrer que quelles que soient ses évolutions, j'ai l'intime conviction que la radio continuera d'être un grand média de proximité pour les auditeurs.",
-            'uploader': 'Thomas Hercouët',
-        },
-    }
-
    def _real_extract(self, url):
        m = self._match_valid_url(url)
        video_id = m.group('id')
@ -237,7 +223,8 @@ class RadioFranceLiveIE(RadioFranceBaseIE):

        if substation_id:
            webpage = self._download_webpage(url, station_id)
-            api_response = self._extract_data_from_webpage(webpage, station_id, 'webRadioData')
+            api_response = self._search_json(r'webradioLive:\s*', webpage, station_id, substation_id,
+                                             transform_source=js_to_json)
        else:
            api_response = self._download_json(
                f'https://www.radiofrance.fr/{station_id}/api/live', station_id)
@ -267,42 +254,66 @@ class RadioFranceLiveIE(RadioFranceBaseIE):
 class RadioFrancePlaylistBaseIE(RadioFranceBaseIE):
    """Subclasses must set _METADATA_KEY"""

-    def _call_api(self, content_id, cursor, page_num):
+    def _call_api(self, station, content_id, cursor):
        raise NotImplementedError('This method must be implemented by subclasses')

-    def _generate_playlist_entries(self, content_id, content_response):
-        for page_num in itertools.count(2):
+    def _generate_playlist_entries(self, station, content_id, content_response):
+        while True:
            for entry in content_response['items']:
-                yield self.url_result(
-                    f'https://www.radiofrance.fr/{entry["path"]}', url_transparent=True, **traverse_obj(entry, {
-                        'title': 'title',
-                        'description': 'standFirst',
-                        'timestamp': ('publishedDate', {int_or_none}),
-                        'thumbnail': ('visual', 'src'),
-                    }))
+                if entry['link'] == '':
+                    yield entry
+                else:
+                    yield self.url_result(
+                        f'https://www.radiofrance.fr{entry["link"]}', url_transparent=True, **traverse_obj(entry, {
+                            'title': 'title',
+                            'description': 'standFirst',
+                            'timestamp': ('publishedDate', {int_or_none}),
+                            'thumbnail': ('visual', 'src'),
+                        }))

-            next_cursor = traverse_obj(content_response, (('pagination', None), 'next'), get_all=False)
-            if not next_cursor:
+            if content_response['next']:
+                content_response = self._call_api(station, content_id, content_response['next'])
+            else:
                break

-            content_response = self._call_api(content_id, next_cursor, page_num)
+    def _extract_embedded_episodes(self, item, webpage, content_id):
+        """Data for certain episodes is embedded directly in the page; use it if the link is missing"""
+        links = item['playerInfo']['media']['sources']
+        item['formats'] = []
+        for linkkey in links:
+            url = self._search_regex(linkkey + r'\.url="([^"]+)";', webpage, content_id)
+            dur = int(self._search_regex(linkkey + r'\.duration=(\d+);', webpage, content_id))
+            preset = self._search_json(linkkey + r'\.preset=', webpage, content_id, content_id, contains_pattern=r'\{.+\}', transform_source=js_to_json)
+            item['formats'].append({
+                'format_id': preset['id'],
+                'url': url,
+                'vcodec': 'none',
+                'acodec': preset['encoding'],
+                'quality': preset['bitrate'],
+                'duration': dur,
+            })
+            item['duration'] = dur
+        return item

    def _real_extract(self, url):
-        display_id = self._match_id(url)
+        playlist_id = self._match_id(url)
+        # If it is a podcast playlist, get the name of the station it is on
+        # profile page playlists are not attached to a station currently
+        station = self._match_valid_url(url).group('station') if isinstance(self, RadioFrancePodcastIE) else None

-        metadata = self._download_json(
-            'https://www.radiofrance.fr/api/v2.1/path', display_id,
-            query={'value': urllib.parse.urlparse(url).path})['content']
-
-        content_id = metadata['id']
+        # Get data for the first page, and the uuid for the playlist
+        metadata = self._call_api(station, playlist_id, 1)
+        uuid = traverse_obj(metadata, ('metadata', 'id'))

        return self.playlist_result(
-            self._generate_playlist_entries(content_id, metadata[self._METADATA_KEY]), content_id,
-            display_id=display_id, **{**traverse_obj(metadata, {
+            self._generate_playlist_entries(station, playlist_id, metadata),
+            uuid,
+            display_id=playlist_id,
+            **{**traverse_obj(metadata['metadata'], {
                'title': 'title',
                'description': 'standFirst',
                'thumbnail': ('visual', 'src'),
-            }), **traverse_obj(metadata, {
+            }), **traverse_obj(metadata['metadata'], {
                'title': 'name',
                'description': 'role',
            })})
@ -311,7 +322,7 @@ class RadioFrancePlaylistBaseIE(RadioFranceBaseIE):
 class RadioFrancePodcastIE(RadioFrancePlaylistBaseIE):
    _VALID_URL = rf'''(?x)
        {RadioFranceBaseIE._VALID_URL_BASE}
-        /(?:{RadioFranceBaseIE._STATIONS_RE})
+        /(?P<station>{RadioFranceBaseIE._STATIONS_RE})
        /podcasts/(?P<id>[\w-]+)/?(?:[?#]|$)
    '''

@ -321,20 +332,20 @@ class RadioFrancePodcastIE(RadioFrancePlaylistBaseIE):
            'id': 'eaf6ef81-a980-4f1c-a7d1-8a75ecd54b17',
            'display_id': 'le-billet-vert',
            'title': 'Le billet sciences',
-            'description': 'md5:eb1007b34b0c0a680daaa71525bbd4c1',
+            'description': 'md5:85d5ce8c488192e71904c551d595f4da',
            'thumbnail': r're:^https?://.*\.(?:jpg|png)',
        },
        'playlist_mincount': 11,
    }, {
-        'url': 'https://www.radiofrance.fr/franceinter/podcasts/jean-marie-le-pen-l-obsession-nationale',
+        'url': 'https://www.radiofrance.fr/franceinter/podcasts/avec-la-langue',
        'info_dict': {
-            'id': '566fd524-3074-4fbc-ac69-8696f2152a54',
-            'display_id': 'jean-marie-le-pen-l-obsession-nationale',
-            'title': 'Jean-Marie Le Pen, l\'obsession nationale',
-            'description': 'md5:a07c0cfb894f6d07a62d0ad12c4b7d73',
+            'id': '53a95989-7c61-48c7-873c-6a71009101bb',
+            'display_id': 'avec-la-langue',
+            'title': 'Avec la langue',
+            'description': 'md5:4ddb6d4ed46dbbdee611b8e16e4af868',
            'thumbnail': r're:^https?://.*\.(?:jpg|png)',
        },
-        'playlist_count': 7,
+        'playlist_mincount': 36,
    }, {
        'url': 'https://www.radiofrance.fr/franceculture/podcasts/serie-thomas-grjebine',
        'info_dict': {
@ -349,10 +360,20 @@ class RadioFrancePodcastIE(RadioFrancePlaylistBaseIE):
            'id': '143dff38-e956-4a5d-8576-1c0b7242b99e',
            'display_id': 'certains-l-aiment-fip',
            'title': 'Certains l’aiment Fip',
-            'description': 'md5:ff974672ba00d4fd5be80fb001c5b27e',
+            'description': 'md5:7c373cdcec7a024f12fa34de7612e44e',
            'thumbnail': r're:^https?://.*\.(?:jpg|png)',
        },
        'playlist_mincount': 321,
+    }, {
+        'url': 'http://www.radiofrance.fr/franceculture/podcasts/serie-les-aventures-de-tintin-les-cigares-du-pharaon',
+        'info_dict': {
+            'id': '01b096c6-e7f8-49c4-8319-dd399221885b',
+            'display_id': 'serie-les-aventures-de-tintin-les-cigares-du-pharaon',
+            'title': 'Les Cigares du Pharaon\xa0: les Aventures de Tintin',
+            'description': 'md5:1c5b6d010b2aaeb0d90b2c233b5f7b15',
+            'thumbnail': r're:^https?://.*\.(?:jpg|png)',
+        },
+        'playlist_count': 5,
    }, {
        'url': 'https://www.radiofrance.fr/franceinter/podcasts/le-7-9',
        'only_matching': True,
@ -363,24 +384,48 @@ class RadioFrancePodcastIE(RadioFrancePlaylistBaseIE):

    _METADATA_KEY = 'expressions'

-    def _call_api(self, podcast_id, cursor, page_num):
-        return self._download_json(
-            f'https://www.radiofrance.fr/api/v2.1/concepts/{podcast_id}/expressions', podcast_id,
-            note=f'Downloading page {page_num}', query={'pageCursor': cursor})
+    def _call_api(self, station, podcast_id, cursor):
+        # The data is stored in the last <script> tag on a page
+        url = f'https://www.radiofrance.fr/{station}/podcasts/{podcast_id}?p={cursor}'
+        webpage = self._download_webpage(url, podcast_id, note=f'Downloading {podcast_id} page {cursor}')
+
+        resp = {}
+        resp['items'] = []
+
+        # _search_json cannot parse the data as it contains javascript
+        # Therefore, parse the episodes objects array separately
+        itemlist = self._search_json(r'a\.items\s*=\s*', webpage, podcast_id, podcast_id,
+                                     contains_pattern=r'\[.+\]', transform_source=js_to_json)
+
+        for item in itemlist:
+            if item['model'] == 'Expression':
+                if item['link'] == '':
+                    item = self._extract_embedded_episodes(item, webpage, podcast_id)
+                resp['items'].append(item)
+
+        # the pagination data is stored in a javascript object 'a'
+        lastPage = int(self._search_regex(r'a\.lastPage\s*=\s*(\d+);', webpage, 'last page', default='0'))
+        hasMorePages = cursor < lastPage
+        resp['next'] = cursor + 1 if hasMorePages else None
+
+        resp['metadata'] = self._search_json(r'content:\s*', webpage, podcast_id, podcast_id,
+                                             transform_source=js_to_json)
+
+        return resp


 class RadioFranceProfileIE(RadioFrancePlaylistBaseIE):
    _VALID_URL = rf'{RadioFranceBaseIE._VALID_URL_BASE}/personnes/(?P<id>[\w-]+)'

    _TESTS = [{
-        'url': 'https://www.radiofrance.fr/personnes/thomas-pesquet?p=3',
+        'url': 'https://www.radiofrance.fr/personnes/thomas-pesquet',
        'info_dict': {
            'id': '86c62790-e481-11e2-9f7b-782bcb6744eb',
            'display_id': 'thomas-pesquet',
            'title': 'Thomas Pesquet',
            'description': 'Astronaute à l\'agence spatiale européenne',
        },
-        'playlist_mincount': 212,
+        'playlist_mincount': 100,
    }, {
        'url': 'https://www.radiofrance.fr/personnes/eugenie-bastie',
        'info_dict': {
@ -398,15 +443,39 @@ class RadioFranceProfileIE(RadioFrancePlaylistBaseIE):

    _METADATA_KEY = 'documents'

-    def _call_api(self, profile_id, cursor, page_num):
-        resp = self._download_json(
-            f'https://www.radiofrance.fr/api/v2.1/taxonomy/{profile_id}/documents', profile_id,
-            note=f'Downloading page {page_num}', query={
-                'relation': 'personality',
-                'cursor': cursor,
-            })
+    def _call_api(self, station, profile_id, cursor):
+        url = 'https://www.radiofrance.fr/personnes/' + profile_id + '?p=' + str(cursor)
+        webpage = self._download_webpage(url, profile_id, note=f'Downloading {profile_id} page {cursor}')
+
+        resp = {}
+        resp['items'] = []
+
+        # get episode data from page
+        pagedata = self._search_json(r'documents\s*:\s*', webpage, profile_id, profile_id,
+                                     transform_source=js_to_json)
+
+        # get the page data
+        pagekey = pagedata['pagination']
+        hasMorePages = False
+        lastPage = int(self._search_regex(re.escape(pagekey) + r'\.lastPage=(\d+);', webpage, 'last page', default='0'))
+        hasMorePages = cursor < lastPage
+        resp['next'] = cursor + 1 if hasMorePages else None
+
+        # Get episode data; not every item is A/V, so keep only the 'Expression' models
+        for item in pagedata['items']:
+            if item['model'] == 'Expression':
+                if item['link'] == '':
+                    item = self._extract_embedded_episodes(item, webpage, profile_id)
+                resp['items'].append(item)
+
+        resp['metadata'] = self._search_json(r'content:\s*', webpage, profile_id, profile_id,
+                                             transform_source=js_to_json)
+        # If the image data is stored separately rather than in the main content area
+        if resp['metadata']['visual'] and isinstance(resp['metadata']['visual'], str):
+            imagedata = {}
+            imagedata['src'] = self._og_search_thumbnail(webpage)
+            resp['metadata']['visual'] = imagedata

-        resp['next'] = traverse_obj(resp, ('pagination', 'next'))
        return resp


@ -423,14 +492,14 @@ class RadioFranceProgramScheduleIE(RadioFranceBaseIE):
            'id': 'franceinter-program-20230217',
            'upload_date': '20230217',
        },
-        'playlist_count': 25,
+        'playlist_count': 27,
    }, {
        'url': 'https://www.radiofrance.fr/franceculture/grille-programmes?date=01-02-2023',
        'info_dict': {
            'id': 'franceculture-program-20230201',
            'upload_date': '20230201',
        },
-        'playlist_count': 25,
+        'playlist_count': 29,
    }, {
        'url': 'https://www.radiofrance.fr/mouv/grille-programmes?date=19-03-2023',
        'info_dict': {
@ -444,7 +513,7 @@ class RadioFranceProgramScheduleIE(RadioFranceBaseIE):
            'id': 'francemusique-program-20230318',
            'upload_date': '20230318',
        },
-        'playlist_count': 15,
+        'playlist_count': 16,
    }, {
        'url': 'https://www.radiofrance.fr/franceculture/grille-programmes',
        'only_matching': True,
--- a/yt_dlp/extractor/reddit.py
+++ b/yt_dlp/extractor/reddit.py
@ -259,6 +259,8 @@ class RedditIE(InfoExtractor):
                f'https://www.reddit.com/{slug}/.json', video_id, expected_status=403)
        except ExtractorError as e:
            if isinstance(e.cause, json.JSONDecodeError):
+                if self._get_cookies('https://www.reddit.com/').get('reddit_session'):
+                    raise ExtractorError('Your IP address is unable to access the Reddit API', expected=True)
                self.raise_login_required('Account authentication is required')
            raise

--- a/yt_dlp/version.py
+++ b/yt_dlp/version.py
@ -1,8 +1,8 @@
 # Autogenerated by devscripts/update-version.py

-__version__ = '2024.11.04'
+__version__ = '2024.11.18'

-RELEASE_GIT_HEAD = '197d0b03b6a3c8fe4fa5ace630eeffec629bf72c'
+RELEASE_GIT_HEAD = '7ea2787920cccc6b8ea30791993d114fbd564434'

 VARIANT = None

@ -12,4 +12,4 @@ CHANNEL = 'stable'

 ORIGIN = 'yt-dlp/yt-dlp'

-_pkg_version = '2024.11.04'
+_pkg_version = '2024.11.18'
Author	SHA1	Message	Date
Léon McGregor	3a877ef0af	Merge `dda6f7b563` into `f919729538`	2024-11-18 15:31:39 +05:30
github-actions[bot]	f919729538	Release 2024.11.18 Created by: bashonly :ci skip all	2024-11-18 05:45:05 +00:00
bashonly	7ea2787920	[ie/reddit] Improve error handling (#11573). Authored by: bashonly	2024-11-18 05:36:38 +00:00
bashonly	f7257588bd	[ie/digitalconcerthall] Support login with access/refresh tokens (#11571). Removes broken support for login with email and password. Removes obsolete `prefer_combined_hls` extractor-arg. Closes #11404, Closes #11436. Authored by: bashonly	2024-11-18 05:16:17 +00:00
lonm	dda6f7b563	[RadioFrance] run autopep	2024-10-15 16:35:28 +01:00
lonm	dcd0ee3ec3	[RadioFrance] ruff trailing commas	2024-10-15 16:30:19 +01:00
lonm	9e3ac89514	[RadioFrance] support pages with embedded playback info	2024-10-15 16:28:49 +01:00
lonm	0fb8bc11ed	[RadioFrance] Fix ruff issues	2024-10-15 15:04:48 +01:00
lonm	3c5e3af7bc	[RadioFrance] Remove defunct test	2024-10-15 14:54:09 +01:00
lonm	9d54ffc768	[RadioFrance] update tests for program grille	2024-10-15 14:52:11 +01:00
lonm	e01fab7041	[RadioFrance] fix profile pagination detection	2024-10-15 14:44:48 +01:00
lonm	867bf965bb	[RadioFrance] Fix playlist api parse	2024-10-15 14:23:47 +01:00
lonm	40f1a95a67	Merge branch 'master' of github.com:yt-dlp/yt-dlp	2024-10-15 13:07:59 +01:00
lonm	dd74aa0bca	[RadioFrance] Fix quote styling	2024-05-16 11:45:17 +01:00
lonm	e5e91ad05d	[RadioFrance] Fix thumb detection on profiles	2024-05-16 11:29:32 +01:00
lonm	7308dc895c	[RadioFrance] Fix outdated tests	2024-05-16 11:29:16 +01:00
lonm	1f719e1934	[RadioFrance] Cleanup imports	2024-05-16 11:00:08 +01:00
lonm	a8edca98f5	[RadioFrance] Fix live substations	2024-05-16 10:59:56 +01:00
lonm	827560f2b9	[RadioFrance] Ep selection is already handled, don't add it here	2024-05-16 10:47:28 +01:00
lonm	5db908bebf	Merge branch 'master' of github.com:LonMcGregor/yt-dlp	2024-05-15 16:41:43 +01:00
lonm	e2243c2033	[RadioFrance] Fix podcast and person playlist downloads	2024-05-15 16:41:26 +01:00
lonm	960b8931c6	Fix podcast and person playlist downloads	2024-05-15 16:39:56 +01:00
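
Both RadioFrance `_call_api` variants in this diff derive the next page from a `lastPage` value embedded in the page's JavaScript. Below is a minimal standalone sketch of that cursor logic. It is not the extractor's code: the helper name `next_cursor`, the `page_key` parameter, the sample string, and the choice to treat a missing marker as "no further pages" are all assumptions made for illustration.

```python
import re


def next_cursor(webpage, cursor, page_key='a'):
    """Return the next page number, or None when `cursor` is the last page.

    Hypothetical helper: `page_key` is the name of the JavaScript object
    that carries `lastPage` in the downloaded page.
    """
    # Escape the key because it is interpolated into a regular expression.
    match = re.search(rf'{re.escape(page_key)}\.lastPage\s*=\s*(\d+);', webpage)
    # Assumption: if the marker is absent, behave as if there are no more pages.
    last_page = int(match.group(1)) if match else 0
    return cursor + 1 if cursor < last_page else None


# Made-up page fragment, shaped like the pattern the diff searches for.
sample = 'a.items=[];a.lastPage = 3;'
```

With this sample, `next_cursor(sample, 1)` yields the following page number, while a cursor equal to `lastPage` yields `None`, which is the value a caller would use to stop requesting pages.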