Cryptography nerd

  • 0 Posts
  • 27 Comments
Joined 1 year ago
Cake day: August 16th, 2023

  • Humans learn a lot through repetition, so there’s no reason to believe that LLMs wouldn’t benefit from reinforcement of higher quality information, especially because seeing the same information in different contexts helps map the links between those contexts and dispel incorrect assumptions. But like I said, the only viable method they have for this kind of emphasis at scale is incidental replication of more popular works in the training samples, and when something is duplicated too much the model overfits on it instead.
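
    A toy sketch of that tension (hypothetical preprocessing, not anything a real pipeline uses as-is): fingerprint exact copies and cap how often a document may repeat. Real corpora are dominated by near-duplicates that exact hashing like this can’t catch, which is exactly the problem.

    ```python
    # Toy illustration: count exact duplicates in a corpus and cap repetitions,
    # so popular works still get some emphasis without dominating training.
    from collections import Counter
    import hashlib

    def fingerprint(text):
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

    def cap_repetitions(corpus, max_copies=4):
        seen = Counter()
        kept = []
        for doc in corpus:
            h = fingerprint(doc)
            seen[h] += 1
            if seen[h] <= max_copies:  # some repetition helps, too much overfits
                kept.append(doc)
        return kept
    ```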

    To fix this conflict they need to fundamentally change big parts of how training happens and how the algorithm learns. In particular it will need a lot more “introspective” training stages to refine what the model has already learned, and pretty much nobody does anything even slightly similar on large models, because they don’t know how and it would be insanely expensive anyway.


  • Yes, but should big companies with business models designed to be exploitative be allowed to act hypocritically?

    My problem isn’t with ML as such, or with learning over such large sets of works, etc, but with these companies designing their services specifically to push the people whose works they rely on out of work.

    The irony of overfitting is that having numerous copies of common works is a problem AND removing the duplicates would be a problem. The models need an understanding of what’s representative of language, but the training algorithms can’t learn that on their own, it’s not feasible to have humans teach it, and the training algorithm also can’t effectively detect duplicates and “tune down” their influence to stop replicating them exactly. Trying to do that last part algorithmically would ALSO break things, because it would break the model’s understanding of stuff like standard legalese and boilerplate language.
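
    A very rough sketch of why automatic down-weighting cuts both ways (hypothetical code, not anything a vendor actually runs): a naive shingle-based similarity check can’t tell a widely mirrored copyrighted work from license text or contract boilerplate that legitimately appears millions of times.

    ```python
    # Naive near-duplicate scoring via word 5-gram Jaccard similarity.
    # A filter built on this would down-weight popular copyrighted works
    # and standard boilerplate (licenses, legalese) alike.
    def shingles(text, n=5):
        words = text.lower().split()
        return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

    def jaccard(a, b):
        sa, sb = shingles(a), shingles(b)
        if not sa or not sb:
            return 0.0
        return len(sa & sb) / len(sa | sb)

    # Both of these would score as "duplicates" against a reference corpus:
    # - the opening chapter of a bestselling novel scraped from many mirrors
    # - the MIT license header attached to millions of source files
    ```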

    The current generation of generative ML doesn’t do what it says on the box, AND the companies running it deserve to get screwed over.

    And yes, I understand the risk of screwing up fair use, which is why my suggestion is not to hinder learning but to require the companies to track the copyright status of samples and inform end users of the licensing status whenever the system detects that a sample is substantially replicated in the output. This will not hurt anybody training on public domain or fairly licensed works, nor anybody who tracks authorship when crawling for samples, and it will also not hurt anybody who has designed their ML system to be sufficiently transformative that it never replicates copyrighted samples. It only hurts exploitative companies.
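
    A minimal sketch of what that detection could look like, assuming the operator kept attribution and license metadata at crawl time (all names here are hypothetical; the point is the obligation, not this exact mechanism):

    ```python
    # Hypothetical post-generation check: flag output that substantially
    # replicates an indexed sample and surface its licensing status to the user.
    def ngrams(text, n=8):
        words = text.split()
        return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

    def licensing_notices(output, indexed_samples, threshold=0.3):
        # indexed_samples: iterable of dicts with "text", "source" and "license",
        # i.e. the attribution metadata the crawler would have to record.
        out = ngrams(output)
        notices = []
        for sample in indexed_samples:
            if out and len(out & ngrams(sample["text"])) / len(out) >= threshold:
                notices.append(f"substantial overlap with {sample['source']} "
                               f"(license: {sample['license']})")
        return notices
    ```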

  • Your scenario would specifically require the cops to ask their techs for a detailed report and then deliberately lie about its conclusions to attack completely random people. And just FYI, the last few rounds of this happened when public WiFi was new, and the cops kept losing so badly in court that it doesn’t really happen much anymore. You don’t even need a great lawyer, just an average one who can find the precedents.

    There are no “additional fingerprints” of relevance binding any node in a tunnel to the communications in the tunnel. It uses PFS and multiple layers of encryption (tunnels within tunnels). They would need to run a debugger against their own node to have any chance of arguing that a specific packet came from a specific node, which would ironically also prove that the node didn’t actually know what it was carrying and was just a blind relay (just like how mailmen aren’t liable for the contents of the packages they deliver).
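
    A toy sketch of the layering idea, just to show why a relay only ever learns the next hop. This is NOT I2P’s actual wire format, key exchange, or tunnel build protocol; it’s plain layered AES-GCM using the Python “cryptography” package:

    ```python
    # Each relay can strip exactly one layer with its own key: it learns
    # where to forward next, while the rest stays an opaque blob.
    import json
    import os

    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    def wrap(payload, hops):
        """hops: one (next_hop, hop_key) pair per relay, last relay on the path first."""
        blob = payload
        for next_hop, key in hops:
            nonce = os.urandom(12)
            inner = json.dumps({"next": next_hop}).encode() + b"\x00" + blob
            blob = nonce + AESGCM(key).encrypt(nonce, inner, None)
        return blob

    def relay_unwrap(blob, key):
        """What a single relay can do with only its own key."""
        nonce, ciphertext = blob[:12], blob[12:]
        inner = AESGCM(key).decrypt(nonce, ciphertext, None)
        header, _, rest = inner.partition(b"\x00")
        return json.loads(header)["next"], rest  # next hop + still-encrypted blob

    # Path: sender -> relay1 -> relay2 -> exit, holding keys[0], keys[1], keys[2].
    # keys = [AESGCM.generate_key(bit_length=128) for _ in range(3)]
    # blob = wrap(b"hello", [("destination", keys[2]), ("exit", keys[1]), ("relay2", keys[0])])
    # relay_unwrap(blob, keys[0]) -> ("relay2", <blob that only relay2 can open>)
    ```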

    Your argument is literally the same one used to claim that nobody should have privacy because those who don’t break laws don’t need it, yet you yourself are arguing for why we still need privacy even when we haven’t broken any laws. The collateral damage when such tools aren’t available is so much greater than when privacy tools are available. One of the greatest successes of Signal is how its popularity makes each of its users part of a “haystack” (a large anonymity set), so targeting individual users just for using it is infeasible, which protects endless numbers of minorities and other at-risk individuals.

    In addition, it’s extremely rare that mass surveillance like spying on network traffic leads to prosecutions. It’s usually infiltration that works, so you running an I2P node will make zero difference.


  • 1: then they would go after literally anybody running a node

    2: their client will not see peers on another IP address, it will just see their own I2P node. Any I2P-aware software will also not have any IP addresses as peers, only I2P-specific internal addresses (see the sketch after this list). They will not even be able to associate an incoming connection with any one node without understanding the I2P network statistics console.

    3: by this argument all anonymization tools should be illegal, Signal too, etc, and nobody should help anybody maintain privacy. In the real world there are plenty of reasons why anonymization tools are necessary. And there will be literally zero evidence tying you to a crime; preexisting legal precedent says an IP address alone is not enough.
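
    To illustrate point 2, a minimal sketch assuming a stock I2P router with its HTTP proxy on the default 127.0.0.1:4444 (adjust to your config; “example.i2p” is a made-up eepsite name). The application’s only TCP peer is the local router, so there simply is no remote IP address for it to see or log:

    ```python
    # All traffic goes to the local I2P router; the remote site's real IP
    # never appears anywhere on this machine.
    import requests  # third-party: pip install requests

    proxies = {"http": "http://127.0.0.1:4444"}  # default I2P HTTP proxy (assumed stock config)
    resp = requests.get("http://example.i2p/", proxies=proxies, timeout=120)
    print(resp.status_code)
    ```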



  • This is not how the law is applied to packet switching.

    If it were store-and-forward then maybe, just maybe, law enforcement would care. But anybody smart enough to set up an I2P node to research it, and who tried to track where packets come from, would first see the packets originate from their own local node at 127.0.0.1. Then, in the I2P console, they could see that the packet came in via an active half-tunnel from their own end, interfacing with the endpoint node of the other side’s half-tunnel, and they would know that node has no idea what it’s relaying (just like their ISP doesn’t).