Discussion: how Opera handles Accept-Encoding

Configuring and using the sidki config set; the recommended starting point for beginners

Moderator: phoenix

red
Posts: 99
Joined: Mar 09 2010, 16:25

Discussion: how Opera handles Accept-Encoding

Post by red » Dec 21 2010, 00:03

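http://www-31.ibm.com/storage/cn/disk/d ... pecs.shtml shows a blank page in Opera, while IE and Firefox display it normally.

phoenix pointed out that this is caused by a problem with Opera's support for Content-Encoding: deflate. After adding the following rule, the page displays normally in Opera.

Code: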

[HTTP headers]
In = FALSE
Out = TRUE
Key = "Accept-Encoding: 3 Fix opera webpage encoding to gzip (out)"
URL = "$URL(http://www-31.ibm.com/)&$OHDR(User-Agent:*opera*)"
Match = "*"
Replace = "gzip"
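However, with Proxomitron in bypass mode the server still returns Content-Encoding: deflate, and Opera displays the page correctly. I suspect the sidki rules may also be involved. ddbb also mentioned that the page displays normally under the sidki 2007 set.

Code: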

GET
Accept-Encoding: deflate, gzip, x-gzip, identity, *;q=0
Cache-Control: no-cache
TE: deflate, gzip, chunked, identity, trailers

RESP
Content-Encoding: deflate
Without the rule above, the server returns Content-Encoding: deflate and the page is blank.

Code:

GET
Accept-Encoding: deflate, gzip, x-gzip

RESP
Content-Encoding: deflate
Cache-Control: max-age=1
Match 1060: Top All Mark: Start     04.07.11 (multi) [sd] (d.r)
Match 1060: Top All Mark: End     06.12.25 [sd] (d.r)
Match 1060: Top JS Mark: Start     10.10.09 (multi) [sd] (d.r)
Match 1060: Top JS: Mark End     10.10.09 [sd] (d.r)
Match 1060: Top HTML Mark: Start     09.06.12 (multi) [sd] (d.r)
Match 1060: Top HTML Mark: End     07.10.24 [sd] (d.r)
+++CLOSE 1060+++
GET 1061 : If-None-Match killed due to IMS: "13e-419930c0"
BlockList 1061: in User-Agents, line 53
With the rule above, the server returns Content-Encoding: gzip and the page displays normally.

Code:

GET
Accept-Encoding: gzip

RESP
Content-Encoding: gzip
Cache-Control: max-age=1
Match 1063: Top All Mark: Start     04.07.11 (multi) [sd] (d.r)
Match 1063: Top All Mark: End     06.12.25 [sd] (d.r)
Match 1063: Top JS Mark: Start     10.10.09 (multi) [sd] (d.r)
Match 1063: Top JS: Mark End     10.10.09 [sd] (d.r)
Match 1063: Top HTML Mark: Start     09.06.12 (multi) [sd] (d.r)
Match 1063: Top Sniff: HTML Content: HTML     10.10.16 (multi) [sd] (d.1)
Match 1063: Top HTML Mark: End     07.10.24 [sd] (d.r)
Match 1063: Header Top Mark: Start - Fix <head>     10.10.16 (multi) [sd] (d.r)
Match 1063: <meta> Block Cache Tags: Cache!/Fresh!     07.08.30 (cch!) [pr] (d.1)
Match 1063: <meta> Remove: PICS-Label - Show in Footer     09.05.14 [sd] (d.1 l.4)
Match 1063: Header Top Add: Initial JS Code     09.11.01 (ccw! !mos) [...] (d.r)
Match 1063: Header Top Mark: End     07.09.06 (multi) [sd] (d.r)
Match 1063: Manage: Sel. Tags     10.08.29 [vm] (d.0)
BlockList 1063: in AdPaths, line 216
BlockList 1063: in AdList, line 85
Match 1063: <script> Block: Scripts by URL     10.10.16 [pr] (d.2)
Match 1063: Header Bot Mark: Start - Fix </head>     09.06.29 (multi) [sd] (d.r)
Match 1063: Header Bot Add: Default Script/Style Type if Missing     07.08.31 [sd] (d.1)
Match 1063: Header Bot Add: Navigation Links     09.07.04 [sd] (d.1)
Match 1063: Header Bot Mark: End     07.11.02 [sd] (d.r)
Match 1063: <body> Mark: Start     09.06.20 (multi) [sd] (d.r)
Match 1063: Body Add: JS Code     09.06.13 (ccw! !nn !mos) [...] (d.r l.3)
Match 1063: <img>... Remove: Webbugs     09.05.28 [sd] (d.1)
BlockList 1063: in AdPaths-J, line 70
Match 1063: <script> Block: Scripts by URL     10.10.16 [pr] (d.2)
BlockList 1063: in AdPaths-J, line 230
Match 1063: <script> Block: Scripts by URL     10.10.16 [pr] (d.2)
Match 1063: <script><style> Remove: Comments     09.07.04 (multi) [jd sd] (d.r)
BlockList 1063: in AdKeys-J, line 244
Match 1063: <script> Block: Ad Scripts - Content     09.11.15 [pr sd jd] (d.2)
Match 1063: <html><body>: Mark First - Remove Dupes     09.06.28 (multi) [sd] (d.r)
Match 1063: HTML Bottom Mark: Start - Close open Tags     09.05.17 (multi) [sd] (d.r)
Match 1063: <html><body>: Mark First - Remove Dupes     09.06.28 (multi) [sd] (d.r)
Match 1063: HTML Bottom Mark: Start - Close open Tags     09.05.17 (multi) [sd] (d.r)
Match 1063: HTML Bottom Mark: Start - Close open Tags     09.05.17 (multi) [sd] (d.r)
Match 1063: Bottom Add: Display Site Specific Info     09.07.04 (!nn) [sd jd] (d.1 l.2)
Match 1063: Bottom Add: Final JS Code     09.06.13 (ccw! !mos) [...] (d.r)
Match 1063: Bottom Mark: End     09.05.08 [sd] (d.r)
Match 1063: <html><body>: Mark First - Remove Dupes     09.06.28 (multi) [sd] (d.r)
Match 1063: <html><body>: Mark First - Remove Dupes     09.06.28 (multi) [sd] (d.r)
+++CLOSE 1063+++
GET 1067 : If-None-Match killed due to IMS: "8a-af30b200"
BlockList 1067: in User-Agents, line 53

red
Posts: 99
Joined: Mar 09 2010, 16:25

Re: Discussion: how Opera handles Accept-Encoding

Post by red » Dec 21 2010, 23:43

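I was overcomplicating this earlier.
When the rule "Top Sniff: HTML Content: HTML 10.10.16 (multi) [sd] (d.1)" does not match, none of the content after it is displayed.
Opera's handling of Content-Encoding: deflate does have a problem.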

phoenix
Site Admin
Posts: 525
Joined: Dec 29 2007, 16:27

Re: Discussion: how Opera handles Accept-Encoding

Post by phoenix » Dec 22 2010, 16:00

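What I said earlier was wrong.

Content compressed with gzip or deflate is decompressed by Proxomitron for filtering and is not recompressed afterwards, so the browser receives the decompressed content.

I tried forcing Firefox to send "Accept-Encoding: deflate", and it could not display the page either.

So it looks like the problem lies in how Proxomitron handles deflate-compressed content.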
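
To make the point above concrete, here is a minimal Python sketch of what a filtering proxy does with a compressed response. It is not Proxomitron's actual code: `filter_response` and `body_filter` are hypothetical names, and dropping the Content-Encoding header is my assumption about how the decompressed body gets passed on.

```python
import gzip
import zlib
from typing import Callable, Dict, Tuple


def filter_response(
    headers: Dict[str, str],
    body: bytes,
    body_filter: Callable[[bytes], bytes],
) -> Tuple[Dict[str, str], bytes]:
    """Decode a compressed body, filter it, and forward it uncompressed."""
    encoding = headers.get("Content-Encoding", "identity").lower()
    if encoding in ("gzip", "x-gzip"):
        plain = gzip.decompress(body)
    elif encoding == "deflate":
        # Assumes the zlib-wrapped form; other variants raise zlib.error here
        plain = zlib.decompress(body)
    else:
        plain = body
    filtered = body_filter(plain)
    # The body is not recompressed, so the encoding header is dropped (assumption)
    out_headers = {k: v for k, v in headers.items() if k != "Content-Encoding"}
    return out_headers, filtered
```

The browser never sees the compressed stream in this model, so a decoding failure in the proxy would affect every browser behind it, which fits the Firefox test.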

ddbb
Moderator
Posts: 425
Joined: Jan 07 2008, 13:30

Re: Discussion: how Opera handles Accept-Encoding

Post by ddbb » Dec 22 2010, 17:11

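Honestly, this problem has been driving me crazy... :cry:

At first I was sure it was caused by the web filters in the newer sidki set...

Turning off the header filters did not fix it, so I focused on the web filters.

But then I manually changed every rule in there from TRUE to FALSE, which means no rule should actually be active...

No luck... the page still would not show up...

Yet when I turned web filtering off, the page appeared... =.=

This is killing me... I have no idea what is going on...

I also have the 2007-09-09 set on hand... I tried it... and the page actually displays fine... :o

I really do not know any more...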

phoenix
Site Admin
Posts: 525
Joined: Dec 29 2007, 16:27

Re: Discussion: how Opera handles Accept-Encoding

Post by phoenix » Dec 22 2010, 22:20

ddbb wrote: I also have the 2007-09-09 set on hand... I tried it... and the page actually displays fine... :o
The corresponding rule in the 2007-09-09 set is:

Code:

[HTTP headers]
In = FALSE
Out = TRUE
Key = "Accept-Encoding: 2 gzip     4.11.22 [srl] (d.1) (Out)"
URL = "^$TST(volat=*.encoded:1.*)|$TST(keyword=*.a_web.*)"
Replace = "gzip, x-gzip, deflate"
gzip comes before deflate here, so the server returned gzip-compressed content and the page displayed normally.

In later versions the rule was changed to:

Code:

[HTTP headers]
In = FALSE
Out = TRUE
Key = "Accept-Encoding: 2 GZip only     07.11.16 [srl] (d.1) (Out)"
URL = "^$TST(volat=*.encoded:1.*)|$TST(keyword=*.a_web.*)"
Match = "?&*(gzip|x-gzip|deflate)\1(*(, (gzip|x-gzip|deflate))\#)+|"
Replace = "\1\@"
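This version does not change the order of the Accept-Encoding values sent by the browser, and Opera happens to put deflate first, so the server returned deflate-compressed content, and that is where things went wrong.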
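
A rough Python model of the two rules, in case the filter syntax is hard to read. This is only my reading of their effect, not a translation of the Proxomitron matching language; the function names are made up, q-values are ignored, and what the real filter does when no supported method is present is not modelled.

```python
SUPPORTED = ("gzip", "x-gzip", "deflate")


def rule_2007_09_09(accept_encoding: str) -> str:
    """Older rule: always send a fixed value with gzip first."""
    return "gzip, x-gzip, deflate"


def rule_07_11_16(accept_encoding: str) -> str:
    """Later rule: keep only the supported methods, in the browser's order."""
    tokens = [t.strip() for t in accept_encoding.split(",")]
    kept = [t for t in tokens if t.lower() in SUPPORTED]
    # The no-match case is out of scope for this sketch
    return ", ".join(kept)


opera = "deflate, gzip, x-gzip, identity, *;q=0"
print(rule_2007_09_09(opera))  # gzip, x-gzip, deflate
print(rule_07_11_16(opera))    # deflate, gzip, x-gzip
```

The second output is the same value that appears in the GET log red posted for the blank page, with deflate still in front.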

red
Posts: 99
Joined: Mar 09 2010, 16:25

Re: Discussion: how Opera handles Accept-Encoding

Post by red » Dec 22 2010, 22:27

Here is another test page: http://www.hi-pda.com/forum/index.php
With Accept-Encoding: deflate forced, the server also returned deflate, and the page displays normally.

phoenix
Site Admin
Posts: 525
Joined: Dec 29 2007, 16:27

Re: Discussion: how Opera handles Accept-Encoding

Post by phoenix » Dec 23 2010, 09:34

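That is because deflate is not a strictly followed standard; different servers and software implement it differently.

Proxomitron can handle most of the common deflate implementations but cannot cope with the few unconventional ones, whereas Opera is evidently more tolerant and compatible.

To work around this, we should give the server as few chances as possible to use deflate compression. I suggest switching back to the "Accept-Encoding: 2 gzip 4.11.22 [srl] (d.1) (Out)" version.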
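
One well-known example of this kind of variation is that some servers send zlib-wrapped deflate while others send a raw deflate stream. The thread does not establish which variant the IBM server used, so take the Python sketch below as an illustration of the general issue only; `inflate_tolerant` is a hypothetical helper.

```python
import zlib

page = b"<html><body>hello</body></html>"

# zlib-wrapped deflate: a small header and checksum around the deflate data
wrapped = zlib.compress(page)

# Raw deflate stream: no header, no checksum (negative wbits selects this)
compressor = zlib.compressobj(wbits=-zlib.MAX_WBITS)
raw = compressor.compress(page) + compressor.flush()


def inflate_tolerant(data: bytes) -> bytes:
    """Try the zlib-wrapped form first, then fall back to raw deflate."""
    try:
        return zlib.decompress(data)
    except zlib.error:
        return zlib.decompress(data, -zlib.MAX_WBITS)
```

A decoder that only accepts one of the two forms fails on the other, while a tolerant client that tries both can display either. Preferring gzip sidesteps the ambiguity altogether.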

phoenix
Site Admin
Posts: 525
Joined: Dec 29 2007, 16:27

Re: Discussion: how Opera handles Accept-Encoding

Post by phoenix » Jan 11 2011, 23:04

JJoe has improved the rule to address this problem.

Code:

[HTTP headers]
In = FALSE
Out = TRUE
Key = "Accept-Encoding: 2 GZip first|specified     10.12.30 [srl] (d.1) (Out) MOD"
URL = "^$TST(volat=*.encoded:1.*)|$TST(keyword=*.a_web.*)"
Match = "$TST(s_AccEnc=?*)|(?*,&((*, )++(gzip$SET(1=,gzip)|x-gzip$SET(2=,x-gzip)|deflate$SET(3=,deflate)))+{1,3}$SET(s_AccEnc=\1\2\3)$TST(s_AccEnc=,\5)$SET(s_AccEnc=\5))"
Replace = "$GET(s_AccEnc)$SET(s_AccEnc=)"
Exceptions lists entry

Code:

## Specify Accept-Encoding header     $SET(s_AccEnc=)$SET(0=s_AccEnc:.)
##
## Accept-Encoding header filter's default behaviour:
## "gzip", "x-gzip", and "deflate" methods may be decompressed by the Proxomitron.
## When any of these methods are present, other methods are removed from the header.
## When any of these methods are present, order is gzip,x-gzip,deflate
## When not present, the header is not altered by the filter.
##
## Use $SET(s_AccEnc=) to modify the filter's behaviour.
## Example:
##   Specify gzip at foobar.com/:
##   foobar.com/ $SET(s_AccEnc=gzip) $SET(0=s_AccEnc:gzip.)
## OR
## Edit "$SET(s_AccEnc=\1\2\3)" in the "Accept-Encoding: 2 GZip first|specified" header filter.
##
Original discussion thread: http://prxbx.com/forums/showthread.php?tid=1711
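
For anyone who wants to check the intended behaviour without reading the filter, the default rules described in the list comments can be modelled in a few lines of Python. This is a sketch under assumptions: the function name and the `override` parameter (standing in for a per-site `s_AccEnc` value) are made up, and the comma-only separator is my guess at the filter's output format.

```python
from typing import Optional

PREFERRED_ORDER = ("gzip", "x-gzip", "deflate")


def gzip_first_or_specified(accept_encoding: str, override: Optional[str] = None) -> str:
    """Model of the 'GZip first|specified' behaviour described above."""
    if override:
        # A site-specific value from the exceptions list wins outright
        return override
    offered = {t.strip().lower() for t in accept_encoding.split(",")}
    kept = [m for m in PREFERRED_ORDER if m in offered]
    # No supported method present: leave the header untouched
    return ",".join(kept) if kept else accept_encoding


print(gzip_first_or_specified("deflate, gzip, x-gzip, identity, *;q=0"))
print(gzip_first_or_specified("deflate, gzip", override="gzip"))
```

With Opera's header the first call puts gzip in front regardless of the order the browser sent, which is the whole point of the change.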
