Add NTLM support for HTTP(s) servers and proxies by ethomson · Pull Request #5052 · libgit2/libgit2

ethomson · 2019-04-16T10:50:45Z

When we removed libcurl, we inadvertently removed support for NTLM connections to proxies; it was supported automatically, which I don't think any of the contributors realized.

This adds NTLM support by including https://github.com/ethomson/ntlmclient, which is a pure C NTLM client implementation that was built to live nicely within the libgit2 codebase.

In addition, it does some refactoring to the HTTP codebase to support NTLM connections to both endpoint servers and proxies.

ethomson · 2019-05-12T21:11:07Z

Rebased on master.

ethomson · 2019-05-24T09:58:08Z

What would help review this? Should I break out the necessary http client refactoring so that one can review the necessary bits and then follow up with an addition of the NTLM library?

tiennou

Sorry, I had wanted to review this sooner but got caught up in other things. So, here's a "sparse" review, in that I've read everything, but didn't look hard for bugs, as most of it seems NFC and/or (nice) cleanups.

tiennou · 2019-05-24T22:04:56Z

+
+static int ntlm_init_context(
+	http_auth_ntlm_context *ctx,
+	const git_net_url *url)


This is out of order, as the git_net_url rename commit is later in the patch series. Just a heads up.

I think the GitHub UI may be showing commits out of order; 8497027 (the rename to git_net_url) is the first commit in this series.

tiennou · 2019-05-24T22:05:59Z

+	    git_buf_oom(&host) ||
+	    git_buf_oom(&port) ||
+	    git_buf_oom(&path) ||
+		git_buf_oom(&query) ||


Nitpick: indent.

tiennou · 2019-05-24T22:08:27Z


-void test_online_clone__url_with_no_path_returns_EINVALIDSPEC(void)
-{
-	cl_git_fail_with(git_clone(&g_repo, "http://github.com", "./foo", &g_options),


tiennou · 2019-05-24T22:19:02Z

-			return -1;
-		}
-
-		git_buf_dispose(&request);


Nitpick: I went looking for a realloc call and found none. I'm not sure I got how this makes it "not realloc", though I understood the intent is to remove that dispose call ⬆️ so the request isn't lost. Maybe ?

Sorry, I could have been more clear. We're not explicitly calling realloc, but we are doing that implicitly, since we were defining the request buffer inside the replay block. So we would create a new buffer for every replay, which is not necessary. Instead we can clear the existing buffer and just put the new URI inside of it.

tiennou · 2019-05-24T22:21:18Z

@@ -0,0 +1,33 @@
+/*


Skipping external dependency code because of sheer laziness. Would you like it reviewed anyway ? (I remember taking a look a few months ago and didn't see anything that stood out).

I'm pretty comfortable with it. The functional portions are a reasonably direct port of Microsoft's code and from a technical perspective, I've run this through fuzzers and memory checkers galore. So although there may be (and I'm sure there still will be) bugs, I think we'll need to ship this to find them.

tiennou · 2019-05-27T20:16:55Z

 	if (parser->status_code == 407 && get_verb == s->verb)
 		return on_auth_required(&t->proxy.cred,
 		    parser,
+			&t->proxy,


Nitpick: strange indent? Not sure who's wrong though 😅.

tiennou · 2019-05-27T20:17:14Z

 	if (parser->status_code == 401 && get_verb == s->verb)
 		return on_auth_required(&t->server.cred,
 		    parser,
+			&t->server,


tiennou · 2019-05-27T20:23:04Z

-	}
-
-	return error;
+	return git_net_url_parse(&t->proxy.url, t->proxy_opts.url);


tiennou · 2019-05-27T20:23:46Z

+		}
+	}
+
+    /* Enforce a reasonable cap on the number of replays */


Nitpick: indent.

ethomson · 2019-06-08T17:57:12Z

/rebuild

libgit2-azure-pipelines · 2019-06-08T17:57:19Z

Okay, @ethomson, I started to rebuild this pull request as build #2017.

"Connection data" is an imprecise and largely incorrect name; these structures are actually parsed URLs.

Provide a parser that takes a URL string and produces a URL structure (if it is valid).

Separate the HTTP redirect handling logic from URL parsing, keeping a `gitno_connection_data_handle_redirect` whose only job is redirect handling and which does not parse URLs itself.

There's no reason a git repository couldn't be at the root of a server, and URLs should have an implicit path of '/' when one is not specified.
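A minimal sketch of the implicit-path rule, assuming the parser hands back either `NULL` or an empty string when the URL had no path. The helper name `path_or_root` is hypothetical and is not taken from the patch.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not from the patch): return the path to request,
 * defaulting to "/" when the URL did not specify one.
 */
static const char *path_or_root(const char *parsed_path)
{
	if (parsed_path == NULL || parsed_path[0] == '\0')
		return "/";

	return parsed_path;
}
```

With this rule, a URL such as "http://github.com" resolves to the path "/" instead of being rejected.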

We did not properly support default credentials for proxies, only for destination servers. Refactor the credential handling to support sending either username/password _or_ default credentials to either the proxy or the destination server.

This actually shares the authentication logic between proxy servers and destination servers. Due to copy/pasta drift over time, they had diverged. Now they share a common logic, which is: first, use credentials specified in the URL (if there were any), treating empty username and password (i.e., "http://:@foo.com/") as default credentials, for compatibility with git. Next, call the credential callbacks. Finally, fall back to WinHTTP compatibility layers using built-in authentication like we always have.

Allowing default credentials for proxies requires moving the security level downgrade into the credential setting routines themselves. We will update our security level to "high" by default, which means that we will never send default credentials without prompting. (A lower setting, like the WinHTTP default of "medium", would allow WinHTTP to handle credentials for us, despite what a user may have requested with their structures.) Now we start with "high" and downgrade to "low" only after a user has explicitly requested default credentials.
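The ordering described above can be sketched roughly as follows. This is an illustration only: the enum, the function name `choose_cred_source`, and the `have_callback` flag are all hypothetical, and the real code also has to handle a callback that declines to supply credentials. A URL with a username but no password is treated here as carrying no URL credentials, which is a simplifying assumption.

```c
#include <stddef.h>

/* Hypothetical names; the patch itself does not define this enum. */
typedef enum {
	CRED_FROM_URL_USERPASS, /* username/password embedded in the URL */
	CRED_FROM_URL_DEFAULT,  /* "http://:@host/": empty user and password */
	CRED_FROM_CALLBACK,     /* ask the caller's credential callback */
	CRED_FROM_FALLBACK      /* fall back to built-in authentication */
} cred_source;

/*
 * username and password are NULL when the URL carried no userinfo.
 * The same routine would be used for the proxy and the destination server.
 */
static cred_source choose_cred_source(
	const char *username,
	const char *password,
	int have_callback)
{
	if (username != NULL && password != NULL) {
		/* Empty user and password means "use default credentials" */
		if (username[0] == '\0' && password[0] == '\0')
			return CRED_FROM_URL_DEFAULT;

		return CRED_FROM_URL_USERPASS;
	}

	if (have_callback)
		return CRED_FROM_CALLBACK;

	return CRED_FROM_FALLBACK;
}
```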

Update our CI tests to start a proxy that requires NTLM authentication; ensure that our Windows HTTP client can speak NTLM.

Increase the permissible replay count; with multiple-step authentication schemes (NTLM, Negotiate), proxy authentication and redirects, we need to be mindful of the number of steps it takes to get connected. 7 seems high but can be exhausted quickly with just a single authentication failure over a redirected multi-state authentication pipeline.

We cannot examine the keep-alive status of the http parser in `http_connect`; it's too late and the critical information about whether keep-alive is supported has been destroyed. Per the documentation for `http_should_keep_alive`:

> If http_should_keep_alive() in the on_headers_complete or
> on_message_complete callback returns 0, then this should be
> the last message on the connection.

Query then and set the state.
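A rough sketch of that idea, using hypothetical stand-in types so the example is self-contained: `fake_parser` and `fake_should_keep_alive` stand in for the real parser and `http_should_keep_alive`, and `transport_state` is an invented name. The point is only that the answer is recorded inside the completion callback and that the connect path later reads the recorded flag.

```c
/*
 * Hypothetical stand-ins: the real parser type and http_should_keep_alive()
 * come from the http parser library and are not reproduced here.
 */
typedef struct {
	int keep_alive_hint; /* what the parser would report right now */
} fake_parser;

typedef struct {
	int keepalive; /* recorded for use when deciding whether to reconnect */
} transport_state;

static int fake_should_keep_alive(const fake_parser *p)
{
	return p->keep_alive_hint;
}

/* Completion callback: record the answer while it is still valid. */
static void on_message_complete(transport_state *t, const fake_parser *p)
{
	t->keepalive = fake_should_keep_alive(p);
}

/* Later, the connect path consults only the recorded flag. */
static int must_reconnect(const transport_state *t)
{
	return !t->keepalive;
}
```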

Some authentication mechanisms (like HTTP Basic and Digest) have a one-step mechanism to create credentials, but there are more complex mechanisms like NTLM and Negotiate that require challenge/response after negotiation, requiring several round-trips. Add an `is_complete` function to know when they have round-tripped enough to be a single authentication and should now either have succeeded or failed to authenticate.
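One way to picture the `is_complete` hook is shown below. Everything except the name `is_complete` is an assumption made for illustration: the struct layout, the step counters, and the convention that a `NULL` hook marks a one-step mechanism are not taken from the patch.

```c
#include <stddef.h>

typedef struct auth_context auth_context;

/* Hypothetical layout; only the is_complete name comes from the text. */
struct auth_context {
	const char *name;
	int steps_done;   /* round-trips performed so far */
	int steps_needed; /* round-trips this mechanism requires (illustrative) */
	int (*is_complete)(const auth_context *ctx); /* NULL: one-step mechanism */
};

/* Multi-step mechanisms are complete once every round-trip has happened. */
static int multi_step_is_complete(const auth_context *ctx)
{
	return ctx->steps_done >= ctx->steps_needed;
}

static int auth_is_complete(const auth_context *ctx)
{
	/* One-step mechanisms have nothing left to negotiate. */
	if (ctx->is_complete == NULL)
		return 1;

	return ctx->is_complete(ctx);
}
```

A caller would treat a 401 or 407 received while `auth_is_complete` is false as the next step of the exchange, and one received after it is true as a real failure.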

When we get an authentication failure, we must consume the entire body of the response. If we only read half of the body (on the assumption that we can ignore the rest) then we will never complete the parsing of the message. This means that we will never set the complete flag, and our replay must actually tear down the connection and try again. This is particularly problematic for stateful authentication mechanisms (SPNEGO, NTLM) that require that we keep the connection alive.

Note that the prior code is only a problem when the 401 that we are parsing is too large to be read in a single chunked read from the http parser. But now we will continue to invoke the http parser until we've got a complete message in the authentication failure scenario. Note that we need not do anything with the message, so when we get an authentication failure, we'll stop adding data to our buffer; we'll simply loop in the parser and let it advance its internal state.

We must always consume the full parser body if we're going to keep-alive. So in the authentication failure case, continue advancing the http message parser until it's complete, then we can retry the connection. Not doing so would mean that we have to tear the connection down and start over. Advancing through fully (even though we don't use the data) will ensure that we can retry a connection with keep-alive.
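The drain loop can be sketched like this, with a hypothetical `fake_response` standing in for the socket plus parser. The real code feeds each chunk to the http parser and waits for its completion callback; here a byte counter plays that role.

```c
#include <stddef.h>

/* Hypothetical stand-in for "socket + parser": counts unread body bytes. */
typedef struct {
	size_t remaining;
	int message_complete;
} fake_response;

/* Consume up to max bytes; mark the message complete when the body ends. */
static size_t fake_read(fake_response *r, size_t max)
{
	size_t n = r->remaining < max ? r->remaining : max;

	r->remaining -= n;

	if (r->remaining == 0)
		r->message_complete = 1;

	return n;
}

/*
 * On an authentication failure, keep reading and discarding until the
 * message is complete, so the connection can be reused for the retry.
 * Returns the number of bytes discarded.
 */
static size_t drain_auth_failure(fake_response *r, size_t chunk_size)
{
	size_t discarded = 0;

	if (chunk_size == 0)
		return 0;

	while (!r->message_complete)
		discarded += fake_read(r, chunk_size);

	return discarded;
}
```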

Ensure that the server supports the particular credential type that we're specifying. Previously we considered credential types as an input to an auth mechanism - since the HTTP transport only supported default credentials (via negotiate) and username/password credentials (via basic), this worked. However, if we are to add another mechanism that uses username/password credentials, we'll need to be careful to identify the types that are accepted.
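A sketch of matching a credential type against the mechanisms a server offers. The flag values, the `auth_scheme` table, and its preference order are assumptions for illustration, and scheme names are compared exactly here for brevity.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical flags, modelled on the idea of a credential-type bitmask. */
enum {
	CRED_USERPASS = 1 << 0,
	CRED_DEFAULT  = 1 << 1
};

typedef struct {
	const char *name;
	unsigned int credtypes; /* credential types this mechanism accepts */
} auth_scheme;

/* Assumed preference order; not taken from the patch. */
static const auth_scheme schemes[] = {
	{ "Negotiate", CRED_DEFAULT },
	{ "NTLM",      CRED_USERPASS },
	{ "Basic",     CRED_USERPASS }
};

static int server_offers(const char *const *offered, size_t n, const char *name)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(offered[i], name) == 0)
			return 1;

	return 0;
}

/* First scheme that the server offered and that accepts this credential. */
static const auth_scheme *pick_scheme(
	const char *const *offered,
	size_t n,
	unsigned int credtype)
{
	size_t i;

	for (i = 0; i < sizeof(schemes) / sizeof(schemes[0]); i++) {
		if (!(schemes[i].credtypes & credtype))
			continue;

		if (server_offers(offered, n, schemes[i].name))
			return &schemes[i];
	}

	return NULL;
}
```

With two mechanisms that both take username/password, the credential type alone no longer identifies the mechanism, so the server's offer has to be consulted as well.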

Include https://github.com/ethomson/ntlmclient as a dependency.

A "connection" to a server is transient, and we may reconnect to a server in the midst of authentication failures (if the remote indicates that we should, via `Connection: close`) or in a redirect.

Hold an individual authentication context instead of trying to maintain all the contexts; we can select the preferred context during the initial negotiation. Subsequent authentication steps will re-use the chosen authentication (until such time as it's rejected) instead of trying to manage multiple contexts when all but one will never be used (since we can only authenticate with a single mechanism at a time).

Also, when we're given a 401 or 407 in the middle of challenge/response handling, short-circuit immediately without incrementing the retry count. The multi-step authentication is expected, and is not a "retry", so it should not be penalized as such.

This means that we don't need to keep the contexts around and ensures that we do not unnecessarily fail for too many retries when we have challenge/response auth on a proxy and a server and potentially redirects in play as well.
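The retry accounting can be sketched as follows. The struct, the function name `note_auth_challenge`, and the `max_replays` parameter are hypothetical; the actual cap is whatever the transport defines.

```c
/* Hypothetical state; names are illustrative only. */
typedef struct {
	int replay_count;
	int auth_in_progress; /* nonzero while a multi-step exchange is mid-flight */
} replay_state;

/*
 * Called on a 401 or 407. Returns 0 if the request may be replayed,
 * or -1 if the replay cap has been reached.
 */
static int note_auth_challenge(replay_state *s, int max_replays)
{
	/* An expected challenge/response step is not a retry. */
	if (s->auth_in_progress)
		return 0;

	if (s->replay_count >= max_replays)
		return -1;

	s->replay_count++;
	return 0;
}
```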

For request-based authentication mechanisms (Basic, Digest) we should keep the authentication context alive across socket connections, since the authentication headers must be transmitted with every request. However, we should continue to remove authentication contexts for mechanisms with connection affinity (NTLM, Negotiate) since we need to reauthenticate for every socket connection.

Instead of using `is_complete` to decide whether we have connection or request affinity for authentication mechanisms, set a boolean on the mechanism definition itself.
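A sketch of what a per-mechanism affinity flag might look like. The field name `connection_affinity` and the surrounding types are assumptions, not copied from the patch.

```c
#include <stddef.h>

/* Hypothetical mechanism definition with an affinity flag. */
typedef struct {
	const char *name;
	int connection_affinity; /* nonzero: authentication is tied to one socket */
} auth_scheme_def;

typedef struct {
	const auth_scheme_def *scheme; /* NULL when no context is held */
} server_auth;

static const auth_scheme_def scheme_basic = { "Basic", 0 };
static const auth_scheme_def scheme_ntlm  = { "NTLM", 1 };

/*
 * Called when the socket closes. Request-based mechanisms keep their
 * context; connection-based ones must authenticate again from scratch.
 */
static void on_socket_closed(server_auth *auth)
{
	if (auth->scheme != NULL && auth->scheme->connection_affinity)
		auth->scheme = NULL;
}
```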

We stop the read loop when we have read all the data. We should also consider the server's feelings. If the server hangs up on us, we need to stop our read loop. Otherwise, we'll try to read from the server - and fail - ad infinitum.
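A minimal sketch of a read loop that also stops on EOF. The `fake_stream` type and both function names are hypothetical, and "all the data" is modelled as a known expected byte count, which is a simplification.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stream that hangs up after 'remaining' bytes. */
typedef struct {
	size_t remaining;
} fake_stream;

/* Returns bytes read, or 0 once the peer has closed the connection. */
static long fake_read(fake_stream *s, char *buf, size_t len)
{
	size_t n = s->remaining < len ? s->remaining : len;

	memset(buf, 'x', n);
	s->remaining -= n;
	return (long)n;
}

/* Read until 'expected' bytes have arrived or the server hangs up. */
static int read_body(fake_stream *s, size_t expected, size_t *total)
{
	char buf[512];

	*total = 0;

	while (*total < expected) {
		size_t want = expected - *total;
		long n;

		if (want > sizeof(buf))
			want = sizeof(buf);

		n = fake_read(s, buf, want);

		if (n < 0)
			return -1; /* read error */

		if (n == 0)
			break; /* server hung up: stop instead of spinning */

		*total += (size_t)n;
	}

	return 0;
}
```

Without the `n == 0` check, a server that closes early would leave the loop condition true forever.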

When we have a keep-alive connection to the server, that server may legally drop the connection for any reason once a successful request and response has occurred. It's common for servers to drop the connection after some amount of time or number of requests have occurred.

When we're issuing a CONNECT to a proxy, we expect to keep-alive to the proxy. However, during authentication negotiations, the proxy may close the connection. Reconnect if the server closes the connection.

When we send HTTP credentials but the server rejects them, tear down the authentication context so that we can start fresh. To maintain this state, additionally move all of the authentication handling into `on_auth_required`.

This appears to be the result of Atom's git-utils package vendoring libgit2 during 5.6.0 and 5.6.1 at an incorrect point in libgit2's history, where there was an ABI change in which gitno_connection_data was renamed to git_net_url. This landed on June 11th, per [this pull request](libgit2/libgit2#5052), but doesn't account for subsequent commits to src/net.c; realistically they should sort out their submodule or not rely on it in the first place. This new git-utils package was pulled in specifically for 1.41.0, and thus reverting it to 5.5.0 from 1.40.0 resolves the issue. Alongside this change I also went ahead and removed some long-unnecessary node engine min / max version comparisons.

ethomson force-pushed the ethomson/netrefactor branch from d12600e to aaa7459 Compare May 12, 2019 21:10

ethomson mentioned this pull request May 15, 2019

implement NTLM authentication #4389

Closed

tiennou reviewed May 27, 2019

ethomson force-pushed the ethomson/netrefactor branch from aaa7459 to 4154c2e Compare June 5, 2019 17:03

ethomson mentioned this pull request Jun 8, 2019

Handling of port component in URLs (with schema) differs from git #5100

Closed

ethomson added 21 commits June 10, 2019 19:58

network: don't add arbitrary url rules

757411a

http: support https for proxies

ee3d35c

ci: test NTLM proxy authentication on Windows

1ef77e3

ci: enable SKIP_OFFLINE_TESTS for windows

ad5419b

ci: enable all proxy tests

7912db4

http: don't realloc the request

e87f912

ntlm: add ntlmclient as a dependency

a7f65f0

http: provide an NTLM authentication provider

3192e3c

http: don't reset replay count after connection

1071852

http: don't set the header in the auth token

6d931ba

http: teach auth mechanisms about connection affinity

539e629

ethomson added 5 commits June 10, 2019 19:58

ci: test NTLM proxy authentication on Unix

4c2ca1b

http: stop on server EOF

9af1de5

http: reconnect to proxy on connection close

005b5bc

http: free auth context on failure

7ea8630

ethomson force-pushed the ethomson/netrefactor branch from 4154c2e to 7ea8630 Compare June 10, 2019 18:58

ethomson merged commit 110b589 into master Jun 11, 2019

ethomson mentioned this pull request Jun 11, 2019

HTTPS proxy support with curl #4515

Closed

hackhaslam mentioned this pull request Aug 29, 2019

ntlm: fix failure to find openssl headers #5216

Merged

ethomson deleted the ethomson/netrefactor branch February 2, 2020 13:08

snyk-bot mentioned this pull request Feb 23, 2020

[Snyk] Upgrade nodegit from 0.4.1 to 0.26.4 saurabharch/Breezeblocks#1

Open

snyk-bot mentioned this pull request Apr 22, 2020

[Snyk] Upgrade nodegit from 0.24.3 to 0.26.5 aminatakonate000/Graviton-App#4

Open

snyk-bot mentioned this pull request May 5, 2020

[Snyk] Upgrade nodegit from 0.24.3 to 0.26.5 Barnstorm-Online/ngp-openapi-generator#1

Open
