Skip to content

Detect circular references for library and free functions #10228

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 30, 2021
Merged

Conversation

Marenz
Copy link
Contributor

@Marenz Marenz commented Nov 9, 2020

Fixes #9565
Fixes #10457
Fixes #11128

@Marenz Marenz force-pushed the issue-9565 branch 2 times, most recently from 8f39a7e to f696907 Compare November 9, 2020 12:21
Copy link
Member

@hrkrshnn hrkrshnn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Some minor comments.

@hrkrshnn hrkrshnn marked this pull request as draft November 24, 2020 10:07
Copy link
Member

@cameel cameel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly style issues.

And yeah, now I see how it won't work without the call graph. It does not cover transitive dependencies where a function calls a function.

@cameel
Copy link
Member

cameel commented Feb 23, 2021

Function call graph (#10973) has been merged so I guess we can now revive this PR. @Marenz, will you be working it do you want someone else to take over?

@Marenz
Copy link
Contributor Author

Marenz commented Feb 23, 2021

I can do it, looking forward to using the call graph!

@Marenz Marenz force-pushed the issue-9565 branch 2 times, most recently from d569f78 to 75e7e77 Compare March 3, 2021 15:54
@Marenz Marenz marked this pull request as ready for review March 3, 2021 16:38
@leonardoalt leonardoalt requested review from cameel and hrkrshnn March 4, 2021 10:52
@cameel
Copy link
Member

cameel commented Mar 4, 2021

Is this actually up for review now? I think @Marenz said yesterday that he's still working on something here so I decided to wait and watch and in the meantime I started reviewing keccak instead :)

@leonardoalt
Copy link
Member

@cameel ah ok, not sure, I based the requests on Marenz marked this pull request as ready for review 18 hours ago above

@Marenz
Copy link
Contributor Author

Marenz commented Mar 4, 2021

Is this actually up for review now?

No, this is ready now

@Marenz Marenz force-pushed the issue-9565 branch 4 times, most recently from f8bd508 to 1945da9 Compare March 4, 2021 15:25
@@ -379,17 +379,21 @@ void NameAndTypeResolver::linearizeBaseContracts(ContractDefinition& _contract)
// "push_front" has the effect that bases mentioned later can overwrite members of bases
// mentioned earlier
input.back().push_front(base);
_contract.annotation().contractDependencies.emplace(base, baseSpecifier.get());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we actually still need this, or are the dependencies from the call graph enough?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have

contract A { function f() public { new B(); } }
contract B is A {}

this will be caught through the call graph.
A special case that changes is if the base function is not executed when the derived is the "most derived":

contract A { function f() public virtual { new B(); } }
contract B is A {
 function f() public override {  }
}

But in that case, it is actually possible for A to create B.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

contractDependencies looks like it only stores information about the base contracts that the derived contract inherits from. The call graph does not have that info. At least not full info. The creation call graph does have edges from Entry to base constructors but only if these constructors are explicitly defined.

I also see that it contains contracts used in member access expressions. The call graph tracks function calls but completely ignores contracts used in such expressions.

And the annotation is used not only for detecting cycles. For example CompilerStack::compileContract() uses it to determine which contracts to compile.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I think there are some cases where adding stuff to contractDependencies is now redundant. For example #10228 (comment).

Though maybe not. The call graph does not take into account stuff that is unreachable. For example if there's a creation cycle caused by unused internal functions, it won't be detected.

contract C { function foo() internal { new D(); } }
contract D { function foo() internal { new C(); } }

Currently it causes an error:

Error: Circular reference for contract creation (cannot create instance of derived or same contract).

But the graph won't report C and D in createdContracts.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by "used in member access expressions"?
The reason CompilerStack::compileContract uses contractDependencies is because new C needs to have C available, and that's the only reason.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant that the TypeChecker was adding contracts used in member access expressions to contractDependencies and function call graph does not add them to createdContracts so it could not be used as a replacement.

But well, I see your point now. It should be possible to generalize createdContracts to just store info about dependencies rather than just about new expressions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran a test and if we don't add them here, our tests all still work, EXCEPT the astjson tests who know are missing their contract dependency information for their field contractDependencies. I'll look into the callgraph and check if we can make it do that somehow..

Copy link
Contributor Author

@Marenz Marenz Mar 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I misinterpreted a previous error wrongly and it seems this does indeed break tests involving libraries like this one:

library L {
    function f() internal {
        new C();
    }
}

contract D {
    function f() public {
        L.f();
    }
}
contract C {
    constructor() { new D(); }
}

The problem is that we never detect the new C(); in L.f() and never fill the dependency C for L..

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the call graph does not properly handle internal library functions? I think there should not be a dependency from L to C but rather from D to C, because the code of L.f is included directly in D.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would also explain the problem I am seeing. I'll see if I can fix my problem with that information..

@@ -379,17 +379,21 @@ void NameAndTypeResolver::linearizeBaseContracts(ContractDefinition& _contract)
// "push_front" has the effect that bases mentioned later can overwrite members of bases
// mentioned earlier
input.back().push_front(base);
_contract.annotation().contractDependencies.emplace(base, baseSpecifier.get());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

contractDependencies looks like it only stores information about the base contracts that the derived contract inherits from. The call graph does not have that info. At least not full info. The creation call graph does have edges from Entry to base constructors but only if these constructors are explicitly defined.

I also see that it contains contracts used in member access expressions. The call graph tracks function calls but completely ignores contracts used in such expressions.

And the annotation is used not only for detecting cycles. For example CompilerStack::compileContract() uses it to determine which contracts to compile.

);

for (auto const& [contract, referencee]: joined)
if (contract != &_contract)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this actually possible? I don't think a contract can create its own instance? Maybe this should be an assert?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm are you sure that's not possible? Why wouldn't it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't new C syntax require the bytecode of C to be embedded in the bytecode of the contract that wants to create it? And we're still compiling C.

If I try to make C create its own instance I get and error that says:

Error: Circular reference for contract creation (cannot create instance of derived or same contract).

This is from:

contract C {
    function foo() public {
        new C();
    }
}

// TypeError 4224: (165-188): Circular reference for contract code access.
// TypeError 4224: (271-293): Circular reference for contract code access.
// TypeError 4224: (381-404): Circular reference for contract code access.
// TypeError 4224: (491-513): Circular reference for contract code access.
// TypeError 7813: (165-188): Circular reference found.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this test should be refactored so that each function is in a separate contract. Otherwise it's not clear if all of these cases still report an error or just some.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, creationBase() and runtimeBase() don't create an error anymore here. However, even when in different contracts, this results in only one error, even though both Test calls would cause one

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am at this point not sure if that's okay

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, even when in different contracts, this results in only one error, even though both Test calls would cause one

I think this might actually be an example of #10228 (comment). I suspect that for both cycles the node added to m_foundCycles is Base and the second one overwrites the first.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll investigate that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reversed the contract <-> referencee now but this contract still only gets one error, no matter in how many contracts it is split... I didn't investigate further yet.

@@ -2,4 +2,6 @@ contract A { function f() public { new B(); } }
contract B { function f() public { new C(); } }
Copy link
Member

@cameel cameel Mar 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some more test ideas:

  • Multiple creation cycles sharing one node (to test the case I mentioned in one of the comments where the error is reported only for that shared node and not separately for each contract).
  • Creating the contract with new in its own constructor (I think we only have that for an ordinary function).
  • A situation where two contracts just reference each other's names without doing anything (i.e. f() public { C; }). Currently this is allowed.
  • Same but with type(C).
  • Two contracts calling each other without creating an actual dependency cycle (we might already have a test for this):
    contract C { function foo(D _d) public { _d.foo(this); } }
    contract D { function foo(C _c) public { _c.foo(this); } }
    Currently this works.
  • Same for libraries
    library L1 { function foo() internal { L2.foo(); } }
    library L2 { function foo() internal { L1.foo(); } }
    Currently this works.
  • Creation cycles involving both new and inheritance. For example:
    • C is D, E is D, E does new C
    • C does new A, A does new D, D is C.
  • Creation cycle caused by unused internal functions (which are ignored by the call graph):
    contract C { function foo() internal { new D(); } }
    contract D { function foo() internal { new C(); } }
    Currently this is an error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering if the libraries case should work. In this new PR it is an error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • C is D, E is D, E does new C — This is currently no error. Should it be?
  • C does new A, A does new D, D is C. — This does produce an error.

Copy link
Member

@cameel cameel Mar 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering if the libraries case should work. In this new PR it is an error.

I think it should because you do not need to compile one first to be able to compile the other. Both libraries have their own copies of both functions compiled into them.

  • C is D, E is D, E does new C — This is currently no error. Should it be?

Right, I think it's fine that it compiles because if we compile C first, E can use its bytecode just fine.

I'd still add it as a test case but the situation I had in mind is actually this: C is D, E is D, C does new E

  • C does new A, A does new D, D is C. — This does produce an error.

Great. This is what I would expect.

Comment on lines 157 to 161
/// List of contracts that this contract or function creates, i.e. contracts which need to be compiled first.
/// For ContractDefinitions: Also includes all contracts from linearizedBaseContracts.
/// Only relevant for contracts or free functions
std::map<ContractDefinition const*, ASTNode const*> contractDependencies;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we be more specific than ASTNode? I think it can only be NewExpression or InheritanceSpecifier. Using a variant here might save us some assertions.

@Marenz
Copy link
Contributor Author

Marenz commented Mar 10, 2021

I pushed an update for this PR that takes care of ~80% of the feedback here. Still a WIP though. Everything marked as resolved should be fine though.

"Circular reference for contract code access."
);
}
if (m_currentContract || m_currentFreeFunction)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has to be moved inside the call graph analysis.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accessing another contract's code is the same as using new OtherContract.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so adding this to the _memberAccess visitor and removing the part here should be enough?

	TypePointer exprType = _memberAccess.expression().annotation().type;
	ASTString const& memberName = _memberAccess.memberName();

	if (auto magicType = dynamic_cast<MagicType const*>(exprType))
		if (magicType->kind() == MagicType::Kind::MetaType && (
			memberName == "creationCode" || memberName == "runtimeCode"
		))
		{
			ContractType const& accessedContractType = dynamic_cast<ContractType const&>(*magicType->typeArgument());
			m_graph.createdContracts.emplace(&accessedContractType, &_memberAccess);
		}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cameel and here we already have the case where it's more than just a new expression..

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I'd personally just use a variant. It's not really that much more intrusive than ASTNode * because with it you just trade get<T> for dynamic_cast<T>. But that's just my preference. ASTNode * is still a valid choice so if you think it's better suited here I'm fine that that too. I'll just probably be suggesting extra assertions because of that :)

@@ -0,0 +1,42 @@
/*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leftover?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's from the "fix macos" commit which is temporary. will be properly integrated now


for (ContractDefinition const* contractDefinition: ASTNode::filteredNodes<ContractDefinition>(source->ast->nodes()))
{
// Ignore libraries as they are tested as part of the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we can do that, libraries can also be deployed in isolation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case we have to change the function call graph to handle internal functions as well when it's a library, even if they are not called...

I would suggest to do that in an extra PR and leave it like this in this one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Internal library functions that are not called will be ignored, but there are also public library functions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The call graph is working perfectly fine. I'll just remove this condition here. Also added some more tests.

if (!cycle)
continue;

ASTNode const* referencee = contractDefinition->annotation().contractDependencies.at(cycle);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we have a cycle of the form
a -> b -> c -> d -> c?
Then the return vaule of the cycleDetector will be c, which is not a dependency of a

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is okay (I also tested your example just now) as the cycle Detector returns only and always the vertex that has a distance of 1 (depth = 1) to the originating vertex

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, this is not how CycleDetector should work. I will double-check, because I thought it works correctly with constants.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the code in line 64 of Algorithms.h does that. Ok, let's just leave it like that.

@chriseth
Copy link
Contributor

Can you add a semantic test that shows that having a "fake cyclic dependency" through a function in a base contract that is never called actually works?

@Marenz
Copy link
Contributor Author

Marenz commented Mar 25, 2021

Can you add a semantic test that shows that having a "fake cyclic dependency" through a function in a base contract that is never called actually works?

You mean an internal function correct?

@chriseth
Copy link
Contributor

Yes, we discussed such examples in the context of whether or not the base contract should be automatically added to the dependencies or not.

@Marenz
Copy link
Contributor Author

Marenz commented Mar 29, 2021

Added requested test case:

contract C
{
	// Internal uncalled function should not cause an cyclic dep. error
	function foo() internal { new D(); }
}

contract D is C
{
}

@chriseth
Copy link
Contributor

I think we should also add a more complicated case:
There is another public function in C that calls foo, but it is overwritten in D by an empty function.

@Marenz
Copy link
Contributor Author

Marenz commented Mar 29, 2021

That would still cause an error though if C is not abstract, that is.

@chriseth
Copy link
Contributor

Needs rebase.

Changelog.md Outdated
@@ -9,6 +9,7 @@ Compiler Features:
* SMTChecker: Report out of bounds index access for arrays and fixed bytes.
* Standard JSON: Model checker option ``settings.modelChecker.targets`` also accepts ``outOfBounds``.
* Yul Optimizer: Added a new step FunctionSpecializer, that specializes a function with its literal arguments.
* References Resolver: Detect circular references to other contracts across libraries and free functions
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not really the references resolver. Cannot think of a better word now. We can just call it "Analysis" if we do not find a better word.

Also please use "code references". References is too general.

@chriseth chriseth force-pushed the issue-9565 branch 2 times, most recently from 4fde2a1 to 0c0081a Compare March 30, 2021 19:04
chriseth
chriseth previously approved these changes Mar 30, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
6 participants