都怪 @oflyhigh 昨天發了一篇很有趣的技術文章 - <a href="https://steemit.com/cn/@oflyhigh/4j3232-python">「Python的程序中使用集合計算簡化問題處理」</a>,搞得我有點心癢,於是便上網做了些簡單的研究。現在是時候讓我回敬一篇技術文章了,哈哈。 事先聲明,我在編程方面可不是專家,有錯的話請不要打臉,手下留情。 --- <h2><center>主題:如何能簡單有效地找出兩個集合之間的差</center></h2> <center></center> 我們先定義兩個集合分別為A和B,而它們的大小分別為 len(A) 和 len(B) 。現在我們希望求的就是 A - B。 o大哥首先提出了一個自家的方式(方法一),用文字寫出來就是: ``` 對於在集合A中的每個元素,如果它不屬於集合B,那麼便把它加到一個新的集合。 (for every element in set A, add to a new set if it is not in set B.) ``` <br> o大哥其後提出的方法就是用Python的內置功能去求出差集(方法二)。 當然方法二就簡潔得多啦,但簡潔不代表快,到底方法一還是方法二會快一些呢? --- <h2><center>雜湊表 Hash table</center></h2> 經過一些資料搜集,我發現Python是用Hash Table的方式把Set實現。甚麼是Hash Table呢? 首先Hash就是把起始資料透過一個函數去變成另一個數值。現在用一下o大哥昨天的例子,假設我們有一個集合:{'oflyhigh', 'ace108', 'laodr'}。經過某個雜湊函數的處理後,假設它變成了{1, 4, 2}。 <center>https://steemitimages.com/DQmUHc32PsGhpRZBSKqy3etu4V64HALFd82wWZvwduo7JwM/image.png</center> 好啦,現在是時候找出 @deanliu 有沒有投票了!我們先把 'deanliu' 掉進雜湊函數,假設得出 3。於是我們立即前往第3格櫃子,一拉開,哦,櫃子是空的,沒有 @deanliu !於是我們用了一步便知道他不在集合當中了。又例如要看看 @ace108 有沒有投票的話,我們先把 'ace108 ' 掉進 雜湊函數,得出4,立即前往第4格櫃子,拉開,找到 @ace108 了! 當然,如果雜湊函數寫得不好,有機會'oflyhigh', 'ace108', 'laodr'三個人都被掉進同一格櫃子,那麼要找出某人是否有投票的話,拉開了櫃子還要仔細尋找,就慢很多了。 --- <h2><center>方法一和方法二的大比拼</center></h2> <h4>方法一</h4> 回顧一下 ``` 對於在集合A中的每個元素,如果它不屬於集合B,那麼便把它加到一個新的集合。 (for every element in set A, add to a new set if it is not in set B.) ``` <br> 針對```如果它不屬於集合B,那麼便把它加到一個新的集合。```這個句子,一般來說我們只好在集合B逐項尋找,所以用上的平均時間與 len(B) 有關。由於需要對於在集合A中的每個元素都做一次,所以用上的時間需要乘以 len(A)。 > 準確一點來說,方法一的平均時間複雜度 = O(len(A)len(B))。 <h4>方法二</h4> 仔細看一看Python的<a href = "https://wiki.python.org/moin/TimeComplexity">documentation</a>,發現其實它外層的算法跟方法一其實一樣!但是對於```如果它不屬於集合B,那麼便把它加到一個新的集合。```這個句子,由於Python是用了雜湊表的技術,不論集合B的大小,它一步就做到了!然後由於```對於在集合A中的每個元素```這個句子,於是跟方法一一樣,用上的時間需要乘以 len(A)。 > 準確一點來說,方法二的平均時間複雜度 = O(len(A))。 其實時間差異就全在於```它不屬於集合B```的判斷方法。顯而易見,方法二要比方法一快很多吧! --- <h2><center>結論</center></h2> 結論非常簡單,就是Python的內置方法比我們簡單構想出來的方法快得多(這也很合理啊,寫Python的人是專業編程師啊)。不過當集合不是太大,其實兩個方法的時間差別不大,肉眼根本就辨別不了,例如要抓出誰人沒點讚,集合頂多是幾百人,兩個方法時間上相差無幾啦。而且要抓出誰人沒點讚也不是甚麼爭分奪秒的事情吧? XD 怎樣也好,也要多謝 @oflyhigh 帶來了這樣有趣的文章,讓我腦筋有動起來的機會啦 :) --- <b>參考</b> 1. 時間複雜度: https://zh.wikipedia.org/wiki/%E6%97%B6%E9%97%B4%E5%A4%8D%E6%9D%82%E5%BA%A6 2. Python documentation: https://wiki.python.org/moin/TimeComplexity 3. 雜湊表 https://zh.wikipedia.org/wiki/%E5%93%88%E5%B8%8C%E8%A1%A8
author | kenchung |
---|---|
permlink | python |
category | cn |
json_metadata | {"tags":["cn","cn-programming","python"],"users":["oflyhigh","deanliu","ace108"],"image":["https://steemitimages.com/DQmeoquHhJ6m6tb2hg88CNxLwBRwDLgR8u4F7sewLE2JHuo/image.png","https://steemitimages.com/DQmUHc32PsGhpRZBSKqy3etu4V64HALFd82wWZvwduo7JwM/image.png"],"links":["https://steemit.com/cn/@oflyhigh/4j3232-python","https://wiki.python.org/moin/TimeComplexity","https://zh.wikipedia.org/wiki/%E6%97%B6%E9%97%B4%E5%A4%8D%E6%9D%82%E5%BA%A6","https://zh.wikipedia.org/wiki/%E5%93%88%E5%B8%8C%E8%A1%A8"],"app":"steemit/0.1","format":"markdown"} |
created | 2017-07-24 07:06:30 |
last_update | 2017-07-24 07:10:09 |
depth | 0 |
children | 11 |
last_payout | 2017-07-31 07:06:30 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 76.802 HBD |
curator_payout_value | 21.727 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 2,405 |
author_reputation | 41,181,348,504,685 |
root_title | "Python的差集運算 - 何者更快?" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 9,501,644 |
net_rshares | 25,549,096,459,153 |
author_curate_reward | "" |
voter | weight | wgt% | rshares | pct | time |
---|---|---|---|---|---|
abit | 0 | 20,378,208,333,431 | 30% | ||
rok-sivante | 0 | 1,191,797,000,304 | 100% | ||
robrigo | 0 | 669,569,857,002 | 100% | ||
biophil | 0 | 74,075,637,238 | 100% | ||
fundurian | 0 | 28,694,893,123 | 31% | ||
blueorgy | 0 | 4,300,755,175 | 0.1% | ||
ubg | 0 | 204,195,051 | 1% | ||
logic | 0 | 1,388,629,487 | 10% | ||
magicmonk | 0 | 44,071,650,374 | 100% | ||
myfirst | 0 | 13,392,116,118 | 36% | ||
oflyhigh | 0 | 451,955,100,335 | 30% | ||
yuxi | 0 | 816,474,878 | 100% | ||
ozymandias | 0 | 13,089,272,871 | 100% | ||
adialam | 0 | 3,414,493,461 | 100% | ||
jubi | 0 | 5,108,518,582 | 100% | ||
steemav | 0 | 1,551,848,809 | 100% | ||
marjuki95 | 0 | 270,694,725 | 100% | ||
lawrenceho84 | 0 | 51,801,342,234 | 12% | ||
ygern | 0 | 8,515,341,708 | 52% | ||
trafalgar | 0 | 1,887,883,238,694 | 9% | ||
cqf | 0 | 132,167,089,893 | 35% | ||
htliao | 0 | 14,711,474,772 | 65% | ||
aarkay | 0 | 74,716,596 | 100% | ||
ojaber | 0 | 63,113,616,458 | 100% | ||
susanlo | 0 | 28,311,255,057 | 80% | ||
goodaytraders | 0 | 208,390,398,994 | 100% | ||
guyverckw | 0 | 27,034,398,111 | 80% | ||
nanosesame | 0 | 2,289,401,370 | 100% | ||
kenchung | 0 | 23,045,212,515 | 100% | ||
pakyeechan | 0 | 2,041,250,193 | 80% | ||
jeffreytong | 0 | 6,685,824,178 | 100% | ||
helloworld123 | 0 | 3,218,248,961 | 100% | ||
nuagnorab | 0 | 7,480,232,390 | 100% | ||
kitcat | 0 | 28,227,072,621 | 63% | ||
rayccy | 0 | 5,529,101,096 | 100% | ||
victorier | 0 | 12,903,586,643 | 100% | ||
linuslee0216 | 0 | 15,744,999,633 | 100% | ||
chl | 0 | 17,308,622,170 | 100% | ||
fatamorgan | 0 | 11,092,907,192 | 33% | ||
wilkinshui | 0 | 24,666,025,021 | 80% | ||
crastty32 | 0 | 1,532,334,736 | 100% | ||
thomaskikansha | 0 | 23,310,004,748 | 60% | ||
mrjt | 0 | 2,105,154,644 | 100% | ||
marylaw | 0 | 5,777,093,370 | 70% | ||
crimsonclad | 0 | 1,159,139,879 | 7.5% | ||
tobykai | 0 | 6,675,559,637 | 100% | ||
krischy | 0 | 18,712,417,438 | 100% | ||
artbyrasho | 0 | 221,077,533 | 100% | ||
azlansugihen | 0 | 1,247,710,106 | 50.04% | ||
mcw | 0 | 6,730,408,200 | 30% | ||
e-bass | 0 | 551,328,431 | 100% | ||
lordp | 0 | 350,232,211 | 30% | ||
yuwineryyuvia | 0 | 1,123,155,793 | 100% | ||
boooster | 0 | 1,115,172,478 | 100% | ||
aaalejandro | 0 | 853,104,438 | 100% | ||
pobi | 0 | 174,108,168 | 100% | ||
kawaiiiiiiii030 | 0 | 1,773,002,173 | 100% | ||
fazima | 0 | 1,120,050,347 | 100% | ||
idx | 0 | 6,201,398,195 | 30% | ||
rmaxhuni | 0 | 145,243,883 | 100% | ||
leogor1234 | 0 | 998,172,917 | 100% | ||
eriza | 0 | 1,160,666,167 | 100% | ||
caucasus | 0 | 1,137,451,998 | 100% | ||
fellis | 0 | 255,345,977 | 100% | ||
khamil | 0 | 441,051,850 | 100% | ||
bish2630 | 0 | 81,246,372 | 100% | ||
tvb | 0 | 0 | 100% |
我发现通过这个方法寻找了还没点赞的朋友后可以写个程序自动COMMENT朋友的帖子叫他们点赞。
author | beautifulbella |
---|---|
permlink | re-kenchung-python-20170724t072104044z |
category | cn |
json_metadata | {"tags":["cn"],"app":"steemit/0.1"} |
created | 2017-07-24 07:21:06 |
last_update | 2017-07-24 07:21:06 |
depth | 1 |
children | 1 |
last_payout | 2017-07-31 07:21:06 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.000 HBD |
curator_payout_value | 0.000 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 46 |
author_reputation | 1,876,932,920,921 |
root_title | "Python的差集運算 - 何者更快?" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 9,502,717 |
net_rshares | 0 |
也可以啊,這樣誰都逃不出你的五指山了 XD
author | kenchung |
---|---|
permlink | re-beautifulbella-re-kenchung-python-20170724t072202760z |
category | cn |
json_metadata | {"tags":["cn"],"app":"steemit/0.1"} |
created | 2017-07-24 07:23:18 |
last_update | 2017-07-24 07:23:18 |
depth | 2 |
children | 0 |
last_payout | 2017-07-31 07:23:18 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.000 HBD |
curator_payout_value | 0.000 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 21 |
author_reputation | 41,181,348,504,685 |
root_title | "Python的差集運算 - 何者更快?" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 9,502,866 |
net_rshares | 0 |
SteemIt API Tool - Check If Your Followers Have Voted Your Post 撸了一个工具 - 快速检查你的粉丝到底有没有给你点赞!(带 免费API,不谢)https://steemit.com/steemit/@justyy/steemit-api-tool-check-if-your-followers-have-voted-your-post-api
author | justyy |
---|---|
permlink | re-kenchung-python-20170727t021636527z |
category | cn |
json_metadata | {"tags":["cn"],"links":["https://steemit.com/steemit/@justyy/steemit-api-tool-check-if-your-followers-have-voted-your-post-api"],"app":"steemit/0.1"} |
created | 2017-07-27 02:16:33 |
last_update | 2017-07-27 02:16:33 |
depth | 1 |
children | 2 |
last_payout | 2017-08-03 02:16:33 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.000 HBD |
curator_payout_value | 0.000 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 204 |
author_reputation | 280,616,224,641,976 |
root_title | "Python的差集運算 - 何者更快?" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 9,861,278 |
net_rshares | 5,657,520,747 |
author_curate_reward | "" |
voter | weight | wgt% | rshares | pct | time |
---|---|---|---|---|---|
kenchung | 0 | 5,442,559,800 | 25% | ||
rmaxhuni | 0 | 214,960,947 | 100% | ||
tvb | 0 | 0 | 100% |
好一個小工具!
author | kenchung |
---|---|
permlink | re-justyy-re-kenchung-python-20170727t133402059z |
category | cn |
json_metadata | {"tags":["cn"],"app":"steemit/0.1"} |
created | 2017-07-27 13:34:00 |
last_update | 2017-07-27 13:34:00 |
depth | 2 |
children | 0 |
last_payout | 2017-08-03 13:34:00 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.000 HBD |
curator_payout_value | 0.000 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 7 |
author_reputation | 41,181,348,504,685 |
root_title | "Python的差集運算 - 何者更快?" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 9,911,852 |
net_rshares | 0 |
不用查啦,我给你点了各种赞!
author | tvb |
---|---|
permlink | re-justyy-re-kenchung-python-20170815t190048298z |
category | cn |
json_metadata | {"tags":["cn"],"app":"steemit/0.1"} |
created | 2017-08-15 19:00:54 |
last_update | 2017-08-15 19:00:54 |
depth | 2 |
children | 0 |
last_payout | 2017-08-22 19:00:54 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.000 HBD |
curator_payout_value | 0.000 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 14 |
author_reputation | 35,178,037,825,802 |
root_title | "Python的差集運算 - 何者更快?" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 11,908,237 |
net_rshares | 0 |
感谢分享 不过我的方式一,也不是在B中逐项查找 而是 if a[i] in b: 类似这样的判断,这个是否使用的是hash表? 以及我们遍历a的过程,在a-b做差集是否还是一样需要?(好像你说也需要)那就改成是否有优化? 其实我最喜欢的是简单粗暴的结果 比如随机生成俩超大集合 然后用1和2去计算,并打印出耗时,😄
author | oflyhigh |
---|---|
permlink | re-kenchung-python-20170724t075556304z |
category | cn |
json_metadata | {"tags":["cn"],"app":"steemit/0.1"} |
created | 2017-07-24 07:55:57 |
last_update | 2017-07-24 07:55:57 |
depth | 1 |
children | 2 |
last_payout | 2017-07-31 07:55:57 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.035 HBD |
curator_payout_value | 0.007 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 159 |
author_reputation | 6,318,597,802,984,395 |
root_title | "Python的差集運算 - 何者更快?" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 9,505,190 |
net_rshares | 11,383,295,514 |
author_curate_reward | "" |
voter | weight | wgt% | rshares | pct | time |
---|---|---|---|---|---|
kenchung | 0 | 11,162,524,811 | 50% | ||
rmaxhuni | 0 | 220,770,703 | 100% |
在python上所有set的object都是用hash表來達成的。 關於a[i] in b,如果b是set,那就一步做完 ( O(1) );如果b是list,那就要逐個查看 ( O(n) ) Reference: https://wiki.python.org/moin/TimeComplexity 所以簡單來說,我猜你當初是打算用list的(?) ,那麼即使用上 'in',也等同於逐個查看,所以也是慢很多啦 我猜python的set difference就是這樣的: for every element in a: if a[i] is not *in* b then copy a[i] to c 粗暴的方法也好啊,但是我不太懂python,懶得學,所以就要交給你啦 XD
author | kenchung |
---|---|
permlink | re-oflyhigh-re-kenchung-python-20170724t084059306z |
category | cn |
json_metadata | {"tags":["cn"],"links":["https://wiki.python.org/moin/TimeComplexity"],"app":"steemit/0.1"} |
created | 2017-07-24 08:42:12 |
last_update | 2017-07-24 08:42:12 |
depth | 2 |
children | 1 |
last_payout | 2017-07-31 08:42:12 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 1.032 HBD |
curator_payout_value | 0.341 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 349 |
author_reputation | 41,181,348,504,685 |
root_title | "Python的差集運算 - 何者更快?" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 9,508,703 |
net_rshares | 357,421,818,015 |
author_curate_reward | "" |
voter | weight | wgt% | rshares | pct | time |
---|---|---|---|---|---|
oflyhigh | 0 | 356,806,658,159 | 25% | ||
jordibv | 0 | 615,159,856 | 100% |
谢谢解答
author | oflyhigh |
---|---|
permlink | re-kenchung-re-oflyhigh-re-kenchung-python-20170724t091159610z |
category | cn |
json_metadata | {"tags":["cn"],"app":"steemit/0.1"} |
created | 2017-07-24 09:12:00 |
last_update | 2017-07-24 09:12:00 |
depth | 3 |
children | 0 |
last_payout | 2017-07-31 09:12:00 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.000 HBD |
curator_payout_value | 0.000 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 4 |
author_reputation | 6,318,597,802,984,395 |
root_title | "Python的差集運算 - 何者更快?" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 9,511,128 |
net_rshares | 0 |
Some good news - you won the [Minnows Accelerator Project - Six of the Best MAP1](https://steemit.com/blogs/@rycharde/minnows-accelerator-project-six-of-the-best-map1-reward-share-and-200-sp-delegatedvote-now)! Go take a look!
author | rycharde |
---|---|
permlink | re-kenchung-python-20170727t172919363z |
category | cn |
json_metadata | {"tags":["cn"],"links":["https://steemit.com/blogs/@rycharde/minnows-accelerator-project-six-of-the-best-map1-reward-share-and-200-sp-delegatedvote-now"],"app":"steemit/0.1"} |
created | 2017-07-27 17:29:21 |
last_update | 2017-07-27 17:29:21 |
depth | 1 |
children | 0 |
last_payout | 2017-08-03 17:29:21 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.098 HBD |
curator_payout_value | 0.000 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 226 |
author_reputation | 19,101,504,594,449 |
root_title | "Python的差集運算 - 何者更快?" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 9,934,985 |
net_rshares | 27,576,766,858 |
author_curate_reward | "" |
voter | weight | wgt% | rshares | pct | time |
---|---|---|---|---|---|
rycharde | 0 | 27,576,766,858 | 100% |
是个好的文章!
author | seokcus | ||||||
---|---|---|---|---|---|---|---|
permlink | re-kenchung-2017724t17758468z | ||||||
category | cn | ||||||
json_metadata | {"tags":"cn","app":"esteem/1.4.7","format":"markdown+html","community":"esteem"} | ||||||
created | 2017-07-24 08:08:00 | ||||||
last_update | 2017-07-24 08:08:00 | ||||||
depth | 1 | ||||||
children | 1 | ||||||
last_payout | 2017-07-31 08:08:00 | ||||||
cashout_time | 1969-12-31 23:59:59 | ||||||
total_payout_value | 0.000 HBD | ||||||
curator_payout_value | 0.000 HBD | ||||||
pending_payout_value | 0.000 HBD | ||||||
promoted | 0.000 HBD | ||||||
body_length | 7 | ||||||
author_reputation | 1,777,848,911,296 | ||||||
root_title | "Python的差集運算 - 何者更快?" | ||||||
beneficiaries |
| ||||||
max_accepted_payout | 1,000,000.000 HBD | ||||||
percent_hbd | 10,000 | ||||||
post_id | 9,506,031 | ||||||
net_rshares | 0 |
謝謝
author | kenchung |
---|---|
permlink | re-seokcus-re-kenchung-2017724t17758468z-20170724t085652504z |
category | cn |
json_metadata | {"tags":["cn"],"app":"steemit/0.1"} |
created | 2017-07-24 08:58:09 |
last_update | 2017-07-24 08:58:09 |
depth | 2 |
children | 0 |
last_payout | 2017-07-31 08:58:09 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.000 HBD |
curator_payout_value | 0.000 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 2 |
author_reputation | 41,181,348,504,685 |
root_title | "Python的差集運算 - 何者更快?" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 9,510,011 |
net_rshares | 0 |