People have been writing as if the amendments to SB 1047 made on Monday, August 19th, or yesterday (August 22nd) assuage the fears I have been talking about.
I once again implore people to read the
Bill Text, instead of relying on commentary about what the bill says.
There have been some improvements, and some regressions. Ultimately, this bill remains regulatory capture of software by "Effective Altruists" (EAs). (EA is the utilitarian philosophy of rationalized subjugation followed by the likes of Sam Bankman-Fried.)
Read the following definition in 22602 (b) and tell me that by "Artificial Intelligence," they don't identically mean "Software":
(b) “Artificial intelligence” means an engineered or machine-based system that varies in its level of autonomy and that can, for explicit or implicit objectives, infer from the input it receives how to generate outputs that can influence physical or virtual environments.
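To make that concrete, here is a toy sketch (my own illustrative example, not from the bill): a rule-based spam filter with no machine learning in it that arguably satisfies every clause of 22602 (b).

```python
# Toy illustration (my example, not the bill's): a rule-based spam filter.
# It is an "engineered system"; it has an explicit objective (flag spam);
# it "infers from the input it receives how to generate outputs" (a flag);
# and that output "can influence virtual environments" (which folder a
# message lands in). By 22602(b), is this not "artificial intelligence"?

def flag_spam(subject: str) -> bool:
    """Decide from the input subject line whether to route a message to spam."""
    spam_markers = ("free money", "act now", "winner")
    return any(marker in subject.lower() for marker in spam_markers)

print(flag_spam("You are a WINNER: claim your free money"))  # True
print(flag_spam("Meeting notes for Thursday"))               # False
```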
The fundamental issue is that the bill aims to regulate models and not deployments. Trying to fix things may just be too hard.
Again, I am just someone who reads English and needs to be aware of this regulation. I also want to say: it sucks that, as just a freelancer, I have to be aware of this bill. That is the chilling effect: wasting time staying on top of this instead of talking to potential clients and building solutions for them.
Not allowing people, at this stage, to experiment with abandon (when much of this technology is yet to be deployed) stems mainly from bad analogies, and ultimately from a Pascal's Mugging of the riches of the industry by the EAs (Anthropic in particular, but many of the Big Incumbent Safety labs in the "Effective Altruism" tradition).
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
The language around safety incidents will now affect a lot more people. It does not seem carefully thought out when to say "covered model" versus "covered model derivative." This is even closer to defining thought crime.
(c) “Artificial intelligence safety incident” means an incident that demonstrably increases the risk of a critical harm occurring by means of any of the following:
(1) A covered model or covered model derivative autonomously engaging in behavior other than at the request of a user.
(2) Theft, misappropriation, malicious use, inadvertent release, unauthorized access, or escape of the model weights of a covered ~~model.~~ *model or covered model derivative.*
(3) The critical failure of technical or administrative controls, including controls limiting the ability to modify a covered ~~model.~~ *model or covered model derivative.*
(4) Unauthorized use of a covered model or covered model derivative to cause or materially enable critical harm
*Strikethroughs are removed text, italics are added. You can see the
text comparison by comparing the bill to the 07/03/2024 version.
I talked about how you can run a copy of whatever the equivalent of Llama 3.1 is at the end of next year for about $33/hour. There is so much you could do with that; now, instead, you will have to hire a lawyer to comply with SB 1047, which means what many estimate to be $0.5M in legal fees just to start.
Copying an open-source model is something even a freelancer can do. Can you tell me, with a straight face, that the above changes weren't written specifically for Big Tech to win? For the same reason that Voter ID laws disenfranchise people from voting, these sorts of requirements disenfranchise people from feeding their families.
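A quick back-of-envelope on those two figures (both are rough estimates from above): $0.5M in legal fees buys a startling amount of actual model-running.

```python
# How much model-running does the estimated $0.5M of compliance legal fees
# buy, at the estimated ~$33/hour to run a Llama-3.1-class model?

inference_cost_per_hour = 33.0   # $/hour, estimate from above
legal_fees = 500_000.0           # $, estimated compliance cost just to start

hours = legal_fees / inference_cost_per_hour
years_continuous = hours / (24 * 365)

print(f"{hours:,.0f} hours of compute")                 # ~15,152 hours
print(f"about {years_continuous:.1f} years running 24/7")  # ~1.7 years
```

In other words, the legal bill alone is roughly two years of round-the-clock inference.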
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
I think that, in response to complaints showing very plainly that the bill operates on compute limits and not cost, some language around cost has been added in many places.
(ii) An artificial intelligence model created by fine-tuning a covered model using a quantity of computing power equal to or greater than three times 10^25 integer or floating-point ~~operations.~~ *operations, the cost of which, as reasonably assessed by the developer, exceeds ten million dollars ($10,000,000) if calculated using the average market price of cloud compute at the start of fine-tuning.*
Tell me: does this read like it is actually a money limit, or did somebody just say that it implies a money limit (which it clearly won't in 10 years if Moore's Law continues)? This clearly locks in the current Cloud Barons: Amazon (Anthropic), Microsoft (OpenAI), and Google.
Using Linode, Vultr, or on-premises compute can often be cheaper (and more secure). But the cost savings won't matter in terms of having to hire a team of lawyers, because the bill assesses cost at the average market price of cloud compute, so cheaper providers won't bring you under the cost limit.
Note also that the limit is $10M, which is easily a Series B amount of money. So it clearly affects startups. And this is just for fine-tuning; again, copying is something even an individual of modest means can do.
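A rough sketch of how the FLOP threshold and the dollar clause drift apart over time. The starting price is a guess on my part (call it $15M for 3×10^25 FLOPs of cloud compute today), and I assume compute prices halve every two years:

```python
# Hypothetical projection: what a fixed 3e25 FLOPs of fine-tuning costs over
# time if cloud compute prices halve every two years. Both the $15M starting
# price and the halving rate are illustrative assumptions, not the bill's.

base_cost = 15_000_000.0        # assumed 2024 cost of 3e25 FLOPs, $
cost_threshold = 10_000_000.0   # the bill's $10M clause

costs = {}
for year in range(2024, 2031):
    cost = base_cost * 0.5 ** ((year - 2024) / 2)
    costs[year] = cost
    status = "over" if cost > cost_threshold else "under"
    print(f"{year}: ${cost:,.0f} ({status} the $10M clause)")
```

Under these assumptions, the same fixed quantity of compute falls below the $10M clause within about two years, so the FLOP count and the dollar figure quickly stop describing the same models.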
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
They replaced the Frontier Model Division with the Government Operations Agency. I am assuming it is this one. So instead of a new agency, we use an existing one. I suppose that could be a good thing? I don't know. This was one of the things Anthropic pushed back on. I wonder whether the pushback was just theater (because why else?), so that they can say they pushed back and now accept the result.
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Further, here are the rules for computing clusters:
22604.
(a) A person that operates a computing cluster shall implement written policies and procedures to do all of the following when a customer utilizes compute resources that would be sufficient to train a covered model:
(1) Obtain the prospective customer’s basic identifying information and business purpose for utilizing the computing cluster, including all of the following:
(A) The identity of the prospective customer.
(B) The means and source of payment, including any associated financial institution, credit card number, account number, customer identifier, transaction identifiers, or virtual currency wallet or wallet address identifier.
(C) The email address and telephonic contact information used to verify the prospective customer’s identity.
(2) Assess whether the prospective customer intends to utilize the computing cluster to train a covered model.
(3) If a customer repeatedly utilizes computer resources that would be sufficient to train a covered model, validate the information initially collected pursuant to paragraph (1) and conduct the assessment required pursuant to paragraph (2) prior to each utilization.
(4) Retain a customer’s Internet Protocol addresses used for access or administration and the date and time of each access or administrative action.
(5) Maintain for seven years and provide to the Attorney General, upon request, appropriate records of actions taken under this section, including policies and procedures put into effect.
(6) Implement the capability to promptly enact a full shutdown of any resources being used to train or operate models under the customer’s control.
(b) A person that operates a computing cluster shall consider industry best practices and applicable guidance from the U.S. Artificial Intelligence Safety Institute, National Institute of Standards and Technology, and other reputable standard-setting organizations.
(c) In complying with the requirements of this section, a person that operates a computing cluster may impose reasonable requirements on customers to prevent the collection or retention of personal information that the person that operates a computing cluster would not otherwise collect or retain, including a requirement that a corporate customer submit corporate contact information rather than information that would identify a specific individual.
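To get a feel for the paperwork, here is a minimal sketch (my own modeling, nothing official) of the record a cluster operator would need to collect and retain for seven years under 22604 (a):

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class CustomerKYCRecord:
    """Illustrative record shape for 22604(a); field names are my own."""
    # 22604(a)(1)(A)-(C): identity, payment details, verified contacts
    identity: str
    payment_method: str            # card / account / crypto-wallet identifier
    financial_institution: str
    email: str
    phone: str
    business_purpose: str
    # 22604(a)(2): operator's assessment of covered-model training intent
    intends_covered_training: bool
    # 22604(a)(4): IP addresses with the date and time of each access
    access_log: list[tuple[str, datetime]] = field(default_factory=list)

# 22604(a)(5): records like this must be retained for seven years and
# produced to the Attorney General on request.
record = CustomerKYCRecord(
    identity="Example Corp",
    payment_method="card ending 4242",
    financial_institution="Example Bank",
    email="ops@example.com",
    phone="+1-555-0100",
    business_purpose="fine-tuning an internal support model",
    intends_covered_training=False,
)
record.access_log.append(("203.0.113.7", datetime(2026, 1, 15, 9, 30)))
print(record.intends_covered_training)
```

That is a banking-style KYC regime imposed on anyone renting out a rack of GPUs.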
Here are the punishments:
22606.
(a) The Attorney General may bring a civil action for a violation of this chapter and to recover all of the following:
(1) For a violation that causes death or bodily harm to another human, harm to property, theft or misappropriation of property, or that constitutes an imminent risk or threat to public safety that occurs on or after January 1, 2026, a civil penalty in an amount not exceeding 10 percent of the cost of the quantity of computing power used to train the covered model to be calculated using average market prices of cloud compute at the time of training for a first violation and in an amount not exceeding 30 percent of that value for any subsequent violation.
(2) For a violation of Section 22607 that would constitute a violation of the Labor Code, a civil penalty specified in subdivision (f) of Section 1102.5 of the Labor Code.
(3) For a person that operates a computing cluster for a violation of Section 22604, for an auditor for a violation of paragraph (6) of subdivision (e) of Section 22603, or for an auditor who intentionally or with reckless disregard violates a provision of subdivision (e) of Section 22603 other than paragraph (6) or regulations issued by the Government Operations Agency pursuant to Section 11547.6 of the Government Code, a civil penalty in an amount not exceeding fifty thousand dollars ($50,000) for a first violation of Section 22604, not exceeding one hundred thousand dollars ($100,000) for any subsequent violation, and not exceeding ten million dollars ($10,000,000) in the aggregate for related violations.
(4) Injunctive or declaratory relief.
(5) (A) Monetary damages.
(B) Punitive damages pursuant to subdivision (a) of Section 3294 of the Civil Code.
(6) Attorney’s fees and costs.
(7) Any other relief that the court deems appropriate.
(b) In determining whether the developer exercised reasonable care as required in Section 22603, all of the following considerations are relevant but not conclusive:
(1) The quality of a developer’s safety and security protocol.
(2) The extent to which the developer faithfully implemented and followed its safety and security protocol.
(3) Whether, in quality and implementation, the developer’s safety and security protocol was inferior, comparable, or superior to those of developers of comparably powerful models.
(4) The quality and rigor of the developer’s investigation, documentation, evaluation, and management of risks of critical harm posed by its model.
(c) (1) A provision within a contract or agreement that seeks to waive, preclude, or burden the enforcement of a liability arising from a violation of this chapter, or to shift that liability to any person or entity in exchange for their use or access of, or right to use or access, a developer’s products or services, including by means of a contract of adhesion, is void as a matter of public policy.
(2) A court shall disregard corporate formalities and impose joint and several liability on affiliated entities for purposes of effectuating the intent of this section to the maximum extent allowed by law if the court concludes that both of the following are true:
(A) The affiliated entities, in the development of the corporate structure among the affiliated entities, took steps to purposely and unreasonably limit or avoid liability.
(B) As the result of the steps described in subparagraph (A), the corporate structure of the developer or affiliated entities would frustrate recovery of penalties, damages, or injunctive relief under this section.
(d) Penalties collected pursuant to this section by the Attorney General shall be deposited into the Public Rights Law Enforcement Special Fund established pursuant to Section 12530 of the Government Code.
(e) This section does not limit the application of other laws.
But here is the worst part (the definition of a computing cluster):
(d) “Computing cluster” means a set of machines transitively connected by data center networking of over 100 gigabits per second that has a theoretical maximum computing capacity of at least 10^20 integer or floating-point operations per second and can be used for training artificial intelligence.
Note: a compute limit, not a monetary limit, again! People whose commentary claims it only affects big, moneyed companies are clearly lying. A seed-stage startup could easily be affected, and compliance lawyers would raise its costs by 50%.
Right now, 10^20 FLOPS of capacity would be about $1M (at roughly $0.01 per 10^12 FLOPS). Next year we'd expect $0.5M, in 2026 $250K, and so on. One could expect, by the end of the decade, to be able to purchase the equivalent with a loan on terms similar to a car loan's.
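Taking the ~$1M present-day figure and the yearly halving at face value (both are rough estimates), the projection to the end of the decade is:

```python
# Projecting the rough figures above: cost of 1e20 FLOPS of cluster
# capacity, starting from a ~$1M 2024 estimate and halving every year.

cost = 1_000_000.0
projection = {}
for year in range(2024, 2031):
    projection[year] = cost
    print(f"{year}: ${cost:,.0f}")
    cost /= 2
# By 2030 the projection is $15,625 -- new-car-loan territory, yet such a
# machine would still be a "computing cluster" under the bill's fixed
# 1e20 FLOPS definition.
```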
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
There are some (pseudo)positive steps:
The word "materially" shows up in a lot of places. Hopefully, those are steps away from thought crime. I assume "materially" means doing something more than thinking about things with the aid of a computer. However, "materially" is often counteracted by "may" or "enable," bringing us back to thought-crime territory.
They expanded the Board of Frontier Models to have more people and to exclude conflicts of interest, which I think is a good thing. But open-source community representation is quite wanting. There is also, in Sec. 4, 11547.6 (e)(1)(B), the phrase "and government entities, including from the open-source community." If this isn't signalling military-industrial-complex capture of open source, it is a very unfortunate mistake.
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Finally, there
was something mainly positive in the amendments to the bill.
(i) “Developer” means a person that performs the initial training of a covered model either by training a model using a sufficient quantity of computing ~~power,~~ *power and cost,* or by fine-tuning an existing covered model ~~using sufficient~~ *or covered model derivative using a* quantity of computing power ~~pursuant to~~ *and cost greater than the amount specified in* subdivision (e).
It could be a little clearer still, but I'll take it. If only the other parts that set limits talked about costs this clearly (ideally even more clearly).