In the third video on support vector machines (SVMs), we begin implementing an SVM on our cancer dataset in scikit-learn. We're using a support vector classifier (SVC) with an RBF (radial basis function) kernel. For a conceptual overview of kernels and how they work, please watch the previous video in this series.

There are many parameters that can be adjusted for our classifier, and the defaults are usually a good place to start. For our cancer dataset, however, the classifier seems to be overfitting with the default parameters: it reaches 100% accuracy on the training subset.

To fix this, we could try adjusting parameters such as C and/or gamma, which control the regularization strength and the width of the Gaussian kernel, respectively. We could also look into the scaling of the data, which is currently unscaled. That is what we're going to work on in the next video. For now, watch the current tutorial to see how to implement SVMs in scikit-learn.

___

As a reminder: in this series I'm exploring the cancer dataset that comes pre-loaded with scikit-learn. The purpose is to train classifiers on this labeled dataset (569 tumor samples, each labeled malignant or benign) and then use them on new, unlabeled data.

___

Previous videos in this series:

1. [Machine Learning on a Cancer Dataset - Part 20](https://steemit.com/machine-learning/@cristi/machine-learning-on-a-cancer-dataset-part-20)
2. [Machine Learning on a Cancer Dataset - Part 21](https://steemit.com/machine-learning/@cristi/machine-learning-on-a-cancer-dataset-part-21)
3. [Machine Learning on a Cancer Dataset - Part 22](https://steemit.com/machine-learning/@cristi/machine-learning-on-a-cancer-dataset-part-22)
4. [Machine Learning on a Cancer Dataset - Part 23](https://steemit.com/machine-learning/@cristi/machine-learning-on-a-cancer-dataset-part-23)
5. [Machine Learning on a Cancer Dataset - Part 24](https://steemit.com/machine-learning/@cristi/machine-learning-on-a-cancer-dataset-part-24)
6. [Machine Learning on a Cancer Dataset - Part 25](https://steemit.com/machine-learning/@cristi/machine-learning-on-a-cancer-dataset-part-25)
7. [Machine Learning on a Cancer Dataset - Part 26](https://steemit.com/machine-learning/@cristi/machine-learning-on-a-cancer-dataset-part-26)

___

<center><iframe width="560" height="315" src="https://www.youtube.com/embed/cciPGGnHAKQ" frameborder="0" allowfullscreen></iframe></center>

___

### <center>To stay in touch with me, follow @cristi </center>

___

[Cristi Vlad](http://cristivlad.com), Self-Experimenter and Author
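
For readers who want to try the setup described in this post, here is a minimal sketch. It is not the exact code from the video. The train/test split and `random_state=0` are assumptions made for illustration. `gamma="auto"` is also set explicitly as an assumption: it was scikit-learn's default at the time of this post, while newer releases default to `gamma="scale"`, which behaves differently on unscaled data.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Load the breast cancer dataset that ships with scikit-learn.
cancer = load_breast_cancer()

# Hold out part of the data for testing (split settings are an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, random_state=0
)

# RBF-kernel support vector classifier on the unscaled features.
# gamma="auto" (1 / n_features) mirrors the older scikit-learn default.
svm = SVC(kernel="rbf", C=1.0, gamma="auto")
svm.fit(X_train, y_train)

train_acc = svm.score(X_train, y_train)
test_acc = svm.score(X_test, y_test)

print("Training accuracy: {:.3f}".format(train_acc))
print("Test accuracy: {:.3f}".format(test_acc))
```

A large gap between the training and test accuracy is the sign of overfitting mentioned in the post. Changing `C` and `gamma`, or scaling the features first, should change both numbers.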
author | cristi |
---|---|
permlink | machine-learning-on-a-cancer-dataset-part-27 |
category | machine-learning |
json_metadata | {"tags":["machine-learning","science","python"],"users":["cristi"],"image":["https://img.youtube.com/vi/cciPGGnHAKQ/0.jpg"],"links":["https://steemit.com/machine-learning/@cristi/machine-learning-on-a-cancer-dataset-part-20","https://steemit.com/machine-learning/@cristi/machine-learning-on-a-cancer-dataset-part-21","https://steemit.com/machine-learning/@cristi/machine-learning-on-a-cancer-dataset-part-22","https://steemit.com/machine-learning/@cristi/machine-learning-on-a-cancer-dataset-part-23","https://steemit.com/machine-learning/@cristi/machine-learning-on-a-cancer-dataset-part-24","https://steemit.com/machine-learning/@cristi/machine-learning-on-a-cancer-dataset-part-25","https://steemit.com/machine-learning/@cristi/machine-learning-on-a-cancer-dataset-part-26","https://www.youtube.com/embed/cciPGGnHAKQ","http://cristivlad.com"],"app":"steemit/0.1","format":"markdown"} |
created | 2017-06-18 13:46:39 |
last_update | 2017-06-18 13:46:39 |
depth | 0 |
children | 6 |
last_payout | 2017-06-25 13:46:39 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 249.364 HBD |
curator_payout_value | 41.153 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 2,566 |
author_reputation | 128,305,218,872,904 |
root_title | "Machine Learning on a Cancer Dataset - Part 27" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 5,177,122 |
net_rshares | 14,352,860,106,741 |
author_curate_reward | "" |
voter | weight | wgt% | rshares | pct | time |
---|---|---|---|---|---|
pharesim | 0 | 804,743,386,841 | 16% | ||
sandra | 0 | 25,248,068,472 | 70% | ||
ihashfury | 0 | 9,091,213,960 | 48.3% | ||
simba | 0 | 1,698,967,147 | 100% | ||
aizensou | 0 | 318,283,573,608 | 100% | ||
jason | 0 | 23,394,072,782 | 48.3% | ||
pairmike | 0 | 88,930,733,112 | 100% | ||
tuck-fheman | 0 | 56,261,991,127 | 100% | ||
nomoreheroes7 | 0 | 24,356,006,328 | 100% | ||
gavvet | 0 | 3,065,823,154,787 | 100% | ||
piranhax | 0 | 6,534,830,744 | 100% | ||
steve-walschot | 0 | 10,334,835,780 | 100% | ||
dragonslayer109 | 0 | 319,853,782,495 | 100% | ||
thecryptofiend | 0 | 419,647,981,570 | 100% | ||
justtryme90 | 0 | 33,468,035,850 | 100% | ||
coinbitgold | 0 | 36,902,868,356 | 100% | ||
applecrisp | 0 | 107,530,192 | 50% | ||
thecryptodrive | 0 | 258,651,755,643 | 100% | ||
infovore | 0 | 283,462,667,018 | 100% | ||
schro | 0 | 3,764,227,134 | 95% | ||
geoffrey | 0 | 312,904,477,299 | 80% | ||
joshbreslauer | 0 | 280,282,804,411 | 70% | ||
slowwalker | 0 | 1,381,857,876,626 | 62% | ||
strangerarray | 0 | 14,447,107,428 | 100% | ||
jacor | 0 | 181,491,553,701 | 100% | ||
paco | 0 | 17,704,862,869 | 100% | ||
speda | 0 | 92,205,359,974 | 100% | ||
igster | 0 | 42,441,110,661 | 100% | ||
bola | 0 | 1,182,999,176 | 1% | ||
sauravrungta | 0 | 80,573,845,589 | 100% | ||
crazymumzysa | 0 | 23,423,563,437 | 100% | ||
prufarchy | 0 | 113,190,250,564 | 100% | ||
anca3drandom | 0 | 104,180,540,615 | 100% | ||
macchicken | 0 | 222,287,110 | 100% | ||
sergey44 | 0 | 414,732,809 | 100% | ||
team-leibniz | 0 | 46,252,594,007 | 100% | ||
logic | 0 | 1,254,687,169 | 100% | ||
drbec | 0 | 656,030,638 | 95% | ||
transhuman | 0 | 2,119,664,625 | 44% | ||
cmp2020 | 0 | 52,787,191,937 | 86% | ||
timsaid | 0 | 92,537,015,843 | 100% | ||
aleksandraz | 0 | 33,391,482,978 | 100% | ||
cristi | 0 | 234,510,498,985 | 100% | ||
scaredycatguide | 0 | 50,166,232,527 | 50% | ||
stylo | 0 | 1,942,386,881 | 100% | ||
lemouth | 0 | 28,549,118,826 | 100% | ||
gammagooblin | 0 | 6,316,255,729 | 100% | ||
lyudmilka | 0 | 59,536,207 | 100% | ||
neptun | 0 | 2,832,943,645 | 100% | ||
creadordelfuturo | 0 | 104,025,762,413 | 50% | ||
jyp | 0 | 199,231,572,943 | 100% | ||
inchonbitcoin | 0 | 331,398,694,674 | 100% | ||
krnel | 0 | 35,293,820,862 | 50% | ||
penguinpablo | 0 | 193,096,513,951 | 76% | ||
greatness | 0 | 1,028,320,759 | 100% | ||
ghasemkiani | 0 | 12,827,890,238 | 100% | ||
wise-elf | 0 | 59,295,340 | 100% | ||
profitgenerator | 0 | 31,346,997,809 | 100% | ||
asksisk | 0 | 130,999,878,887 | 100% | ||
damarth | 0 | 464,391,748,890 | 100% | ||
aksinya | 0 | 14,980,043,694 | 100% | ||
remlaps1 | 0 | 14,276,708,805 | 86% | ||
bosjaya | 0 | 153,899,512 | 60% | ||
cub1 | 0 | 6,591,545,696 | 86% | ||
rarcntv | 0 | 75,431,287 | 16% | ||
olesya1989 | 0 | 156,121,517 | 100% | ||
mitchelljaworski | 0 | 8,382,007,126 | 100% | ||
keuudeip | 0 | 6,158,909,944 | 100% | ||
steemstem | 0 | 1,138,209,493,249 | 100% | ||
mokluc | 0 | 4,864,141,283 | 100% | ||
dragosroua | 0 | 52,678,750,521 | 100% | ||
raluca | 0 | 2,925,899,290 | 100% | ||
selwi | 0 | 851,978,329 | 100% | ||
angel76 | 0 | 17,050,673,810 | 100% | ||
zambez | 0 | 159,344,310 | 100% | ||
chessminator | 0 | 132,727,169 | 100% | ||
goldsteem | 0 | 81,965,114,043 | 100% | ||
blackchen | 0 | 75,846,638,032 | 100% | ||
detol | 0 | 1,519,683,323 | 100% | ||
vcelier | 0 | 619,392,524,338 | 100% | ||
timothyb | 0 | 10,409,088,935 | 100% | ||
tbraun | 0 | 501,024,237 | 100% | ||
detlef-s | 0 | 1,589,521,600 | 100% | ||
richa.proffy | 0 | 319,323,603 | 100% | ||
remlaps2 | 0 | 67,422,467 | 100% | ||
kochmaster | 0 | 263,651,496 | 50% | ||
elgeko | 0 | 18,478,534,614 | 50% | ||
lisa.palmer | 0 | 1,564,847,911 | 86% | ||
middle-theory | 0 | 10,816,478,704 | 100% | ||
hagbardceline | 0 | 34,406,987,274 | 100% | ||
trafalgar | 0 | 1,303,754,884,019 | 24% | ||
crawfish37 | 0 | 1,614,195,905 | 100% | ||
thomasgutierrez | 0 | 2,827,145,515 | 100% | ||
cqf | 0 | 62,842,130,533 | 100% | ||
aismor | 0 | 609,907,847 | 100% | ||
fingersik | 0 | 1,705,379,966 | 100% | ||
velimir | 0 | 447,371,127 | 10% | ||
saitirumerla | 0 | 13,390,341,229 | 100% | ||
synapse | 0 | 674,308,276 | 100% | ||
passion-fruit | 0 | 4,744,143,854 | 100% | ||
fortune-master | 0 | 5,142,722,862 | 100% | ||
aarkay | 0 | 286,413,618 | 100% | ||
cerebralace | 0 | 206,697,343 | 100% | ||
bosman | 0 | 1,518,875,072 | 100% | ||
schro.one | 0 | 2,090,055,500 | 95% | ||
edwinvanrooij | 0 | 2,536,709,740 | 100% | ||
matiasrodrigo | 0 | 135,808,481 | 100% | ||
sneakgeekz | 0 | 7,304,560,108 | 25% | ||
dhn0411 | 0 | 1,033,792,745 | 100% | ||
macka137 | 0 | 1,469,824,303 | 100% | ||
fifthangel | 0 | 13,672,868,960 | 100% | ||
rublevoy | 0 | 533,298,082 | 100% | ||
maria-k | 0 | 719,559,780 | 100% | ||
mkotibabu | 0 | 1,075,518,658 | 100% | ||
aguayojoshua | 0 | 636,068,226 | 100% | ||
kaiching77 | 0 | 1,225,621,359 | 100% | ||
choboscientist | 0 | 290,716,946 | 100% | ||
randowhale | 0 | 262,525,718,982 | 4.09% | ||
sarmins | 0 | 278,570,083 | 100% | ||
jean.racines | 0 | 544,492,709 | 100% | ||
financialcritic | 0 | 34,688,795,264 | 100% | ||
jupiter5 | 0 | 92,856,667 | 100% | ||
teutorigos | 0 | 354,015,928 | 100% | ||
ekan | 0 | 185,713,245 | 100% | ||
dijana969 | 0 | 174,106,096 | 100% | ||
raymondc | 0 | 290,176,630 | 100% | ||
arckrai | 0 | 145,088,301 | 100% | ||
brnofre | 0 | 774,315,621 | 100% | ||
zeryius | 0 | 266,962,011 | 100% | ||
belike | 0 | 290,175,806 | 100% | ||
ramini68 | 0 | 284,372,040 | 100% | ||
veleje | 0 | 237,943,821 | 100% | ||
canwaals | 0 | 290,175,386 | 100% |
Could you explain a bit what RVC and SBF are? (sorry if I got the acronyms wrong, I'm on mobile) XD
author | aguayojoshua | ||||||
---|---|---|---|---|---|---|---|
permlink | re-cristi-2017618t102824371z | ||||||
category | machine-learning | ||||||
json_metadata | {"tags":"machine-learning","app":"esteem/1.4.5","format":"markdown+html","community":"esteem"} | ||||||
created | 2017-06-18 14:28:24 | ||||||
last_update | 2017-06-18 14:28:24 | ||||||
depth | 1 | ||||||
children | 1 | ||||||
last_payout | 2017-06-25 14:28:24 | ||||||
cashout_time | 1969-12-31 23:59:59 | ||||||
total_payout_value | 0.000 HBD | ||||||
curator_payout_value | 0.000 HBD | ||||||
pending_payout_value | 0.000 HBD | ||||||
promoted | 0.000 HBD | ||||||
body_length | 104 | ||||||
author_reputation | 469,235,892,088 | ||||||
root_title | "Machine Learning on a Cancer Dataset - Part 27" | ||||||
beneficiaries | | ||||||
max_accepted_payout | 1,000,000.000 HBD | ||||||
percent_hbd | 10,000 | ||||||
post_id | 5,180,070 | ||||||
net_rshares | 0 |
SVC is the support vector classifier, and RBF is the radial basis function kernel, also known as the Gaussian kernel. Both are explained in the previous [video](https://www.youtube.com/watch?v=404knXpDaPM).
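
As a rough illustration of what the RBF kernel computes, here is a small sketch, not taken from the video. The helper name `rbf` and the example vectors are made up for this example. It evaluates `exp(-gamma * ||x - z||^2)` by hand and compares the result with scikit-learn's `rbf_kernel`.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel


def rbf(x, z, gamma):
    """Hypothetical helper: RBF (Gaussian) kernel value for two vectors."""
    diff = np.asarray(x, dtype=float) - np.asarray(z, dtype=float)
    # Squared Euclidean distance, scaled by gamma, then exponentiated.
    return float(np.exp(-gamma * np.dot(diff, diff)))


# Two made-up points; their squared distance is (-1)^2 + 2^2 = 5.
x = np.array([1.0, 2.0])
z = np.array([2.0, 0.0])
gamma = 0.5

manual = rbf(x, z, gamma)
library = rbf_kernel(x.reshape(1, -1), z.reshape(1, -1), gamma=gamma)[0, 0]

print(manual, library)
```

The kernel value is close to 1 for points that are near each other and drops toward 0 as they move apart. A larger `gamma` makes that drop faster, which is why gamma is described as controlling the width of the Gaussian kernel.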
author | cristi |
---|---|
permlink | re-aguayojoshua-re-cristi-2017618t102824371z-20170618t145126972z |
category | machine-learning |
json_metadata | {"tags":["machine-learning"],"links":["https://www.youtube.com/watch?v=404knXpDaPM"],"app":"steemit/0.1"} |
created | 2017-06-18 14:50:33 |
last_update | 2017-06-18 14:50:33 |
depth | 2 |
children | 0 |
last_payout | 2017-06-25 14:50:33 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.000 HBD |
curator_payout_value | 0.000 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 199 |
author_reputation | 128,305,218,872,904 |
root_title | "Machine Learning on a Cancer Dataset - Part 27" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 5,181,731 |
net_rshares | 0 |
Machine Learning is so interesting. Thank you for this Post!
author | arckrai |
---|---|
permlink | re-cristi-machine-learning-on-a-cancer-dataset-part-27-20170618t141451579z |
category | machine-learning |
json_metadata | {"tags":["machine-learning"],"app":"steemit/0.1"} |
created | 2017-06-18 14:14:54 |
last_update | 2017-06-18 14:14:54 |
depth | 1 |
children | 1 |
last_payout | 2017-06-25 14:14:54 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.000 HBD |
curator_payout_value | 0.000 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 60 |
author_reputation | 16,610,122,450,873 |
root_title | "Machine Learning on a Cancer Dataset - Part 27" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 5,179,053 |
net_rshares | 145,088,301 |
author_curate_reward | "" |
voter | weight | wgt% | rshares | pct | time |
---|---|---|---|---|---|
arckrai | 0 | 145,088,301 | 100% |
You're welcome.
author | cristi |
---|---|
permlink | re-arckrai-re-cristi-machine-learning-on-a-cancer-dataset-part-27-20170618t142123878z |
category | machine-learning |
json_metadata | {"tags":["machine-learning"],"app":"steemit/0.1"} |
created | 2017-06-18 14:20:30 |
last_update | 2017-06-18 14:20:30 |
depth | 2 |
children | 0 |
last_payout | 2017-06-25 14:20:30 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.000 HBD |
curator_payout_value | 0.000 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 15 |
author_reputation | 128,305,218,872,904 |
root_title | "Machine Learning on a Cancer Dataset - Part 27" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 5,179,471 |
net_rshares | 145,088,301 |
author_curate_reward | "" |
voter | weight | wgt% | rshares | pct | time |
---|---|---|---|---|---|
arckrai | 0 | 145,088,301 | 100% |
This post received a 4.1% upvote from @randowhale thanks to @cristi! For more information, [click here](https://steemit.com/steemit/@randowhale/introducing-randowhale-will-you-get-the-100-vote-give-it-a-shot)!
author | randowhale |
---|---|
permlink | re-machine-learning-on-a-cancer-dataset-part-27-20170624t145055 |
category | machine-learning |
json_metadata | "{"app": "randowhale/0.1", "format": "markdown"}" |
created | 2017-06-24 14:50:57 |
last_update | 2017-06-24 14:50:57 |
depth | 1 |
children | 0 |
last_payout | 2017-07-01 14:50:57 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.028 HBD |
curator_payout_value | 0.000 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 210 |
author_reputation | 47,657,457,485,459 |
root_title | "Machine Learning on a Cancer Dataset - Part 27" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 5,918,127 |
net_rshares | 2,563,895,891 |
author_curate_reward | "" |
voter | weight | wgt% | rshares | pct | time |
---|---|---|---|---|---|
nextgencrypto | 0 | 2,563,895,891 | 1% |
Good luck with your post!
author | rublevoy |
---|---|
permlink | re-cristi-machine-learning-on-a-cancer-dataset-part-27-20170618t142357423z |
category | machine-learning |
json_metadata | {"tags":["machine-learning"],"app":"steemit/0.1"} |
created | 2017-06-18 14:24:00 |
last_update | 2017-06-18 14:24:00 |
depth | 1 |
children | 0 |
last_payout | 2017-06-25 14:24:00 |
cashout_time | 1969-12-31 23:59:59 |
total_payout_value | 0.000 HBD |
curator_payout_value | 0.000 HBD |
pending_payout_value | 0.000 HBD |
promoted | 0.000 HBD |
body_length | 29 |
author_reputation | 220,503,083,310 |
root_title | "Machine Learning on a Cancer Dataset - Part 27" |
beneficiaries | [] |
max_accepted_payout | 1,000,000.000 HBD |
percent_hbd | 10,000 |
post_id | 5,179,726 |
net_rshares | 0 |