Google收到的每个请求都会打开多少个套接字？-HowmanysocketsdoGoogleopenforeveryrequestitreceives?

作者：白羊蓝色雨线 | 来源：互联网 | 2023-06-30 10:38

Thefollowingismyrecentinterviewexperiencewithareputednetworksoftwarecompany.Iwasasked

The following is my recent interview experience with a reputed network software company. I was asked questions interconnecting TCP level and web-requests and hence confusing me a lot. I really would like to know what expert opinions on the answers. It is not just about interview but fundamental understanding of how networking work (or how Application layer and Transport layer cross-talk, if at all they do).

以下是我最近与知名网络软件公司的面试经历。我被问到互连TCP级别和Web请求的问题，因此让我很困惑。我真的很想知道答案的专家意见。这不仅仅是关于访谈，而是对网络工作方式的基本理解（或者应用层和传输层如何交叉，如果他们做的话）。

Interviewer : Tell me the process that happens behind the scene when I open a browser and type google.com in it.
采访者：告诉我当我打开浏览器并在其中输入google.com时，幕后发生的过程。

Me : The first thing that happens is a socket is created which is identified by {SRC-IP, SRC-PORT, DEST-IP, DEST-PORT, PROTOCOL}. The SRC-PORT number is a random number given by browser. Usually TCP/IP connection protocol (three way handshake is established). Now the both client (my browser) and server (google) are ready to handle requests. (TCP connection is established)
我：首先发生的是创建一个由{SRC-IP，SRC-PORT，DEST-IP，DEST-PORT，PROTOCOL}标识的套接字。 SRC-PORT号码是浏览器给出的随机数。通常是TCP / IP连接协议（建立三次握手）。现在，客户端（我的浏览器）和服务器（谷歌）都准备好处理请求。（建立TCP连接）
Interviewer: wait, when does the name resolution happens?
采访者：等一下，名称解析何时发生？

Me: yep, I am sorry. It should have happened before even the socket is created. DNS name resolve happens first to get the IP address of google to reach at.
我：是的，对不起。它应该在创建套接字之前发生。 DNS名称解析首先发生以获取谷歌的IP地址。
Interviewer : Is a socket created for DNS name resolution?
访问者：是否为DNS名称解析创建了一个套接字？

Me : hmm, Iactually do not know. But all I know DNS name resolution is a connection-less. That is it not TCP but UDP. only a single request-response cycle happens. (so is a new socket created for DNS name resolution).
我：嗯，实际上不知道。但我所知道的DNS名称解析都是无连接的。那不是TCP而是UDP。只发生一个请求 - 响应周期。（这是为DNS名称解析创建的新套接字）。
Interviewer: google.com is open for other requests from other clients. so is establishing you connection with google is blocking other users?
采访者：google.com对其他客户的其他请求开放。所以建立你与谷歌的连接阻止其他用户？

Me: I am not sure how google handles this. But in a typical socket communication, it is blocking to a minimal extent.
我：我不确定谷歌如何处理这个问题。但是在典型的套接字通信中，它在最小程度上是阻塞的。
Interviewer : How do you think this can be handled?
采访者：您认为如何处理？

Me : I guess the process forks a new thread and creates a socket to handle my request. From now on, my socket endpoint of communication with google is this this child socket.
我：我猜这个过程会分叉一个新线程并创建一个套接字来处理我的请求。从现在开始，我与谷歌通信的套接字端点就是这个子套接字。
Interviewer: If that is the case, is this child socket’s port number different than the parent one.?
采访者：如果是这种情况，这个子套接字的端口号是否与父套接字不同。

Me: The parent socket is listening at 80 for new requests from clients. The child must be listening at a different port number.
我：父套接字在80处侦听来自客户端的新请求。孩子必须在不同的端口号上听。
Interviewer: how is your TCP connection maintained since your Dest-port number has changed. (That is the src-port number sent on google's packet) ?
采访者：由于您的Dest-port编号已更改，您的TCP连接如何维护。（那是google数据包发送的src-port号码）？

Me: The dest-port as what I see as client is always 80. when request is sent back, it also comes from port 80. I guess the OS/the parent process sets the src port back to 80 before it sends back the post.
我：我看作客户端的目标端口总是80.当请求被发送回来时，它也来自端口80.我想OS /父进程在发回邮件之前将src端口设置回80 。
Interviewer : how long is your socket connection established with google?
采访者：你的套接字连接与谷歌有多长时间了？

Me : If I don’t make any requests for a period of time, the main thread closes its child socket and any subsequent requests from me will be like am a new client.
我：如果我没有提出一段时间的请求，主线程关闭其子套接字，我的任何后续请求将像我的新客户端。
Interviewer : No, google will not keep a dedicated child socket for you. It handles your request and discards/recycles the sockets right away.
采访者：不，谷歌不会为你保留一个专用的子插座。它处理您的请求并立即丢弃/回收插座。
Interviewer: Although google may have many servers to serve requests, each server can have only one parent socket opened at port 80. The number of clients to access google webpage must be exceeding larger than the number of servers they have. How is this usually handled?
采访者：虽然谷歌可能有很多服务器来处理请求，但每个服务器只能在端口80打开一个父插槽。访问谷歌网页的客户端数量必须超过他们拥有的服务器数量。这通常如何处理？

Me : I am not sure how this is handled. I see the only way it could work is spawn a thread for each request it receives.
我：我不确定如何处理。我看到它可以工作的唯一方法是为它收到的每个请求生成一个线程。
Interviewer : Do you think the way Google handles is different from any bank website?
采访者：您认为Google处理的方式与任何银行网站不同吗？

*Me *: At the TCP-IP socket level, it should be similar. At the request level, slightly different because a session is maintained to keep state between requests in Banks websites.
* Me *：在TCP-IP套接字级别，它应该是类似的。在请求级别，略有不同，因为维护会话以保持Banks网站中的请求之间的状态。

If someone can give an explanation of each of the points, it will be very helpful for many beginners in Networking.

如果有人可以解释每个要点，那么对于网络中的许多初学者来说这将是非常有帮助的。

Thanks

谢谢

2 个解决方案

#1

How many sockets do Google open for every request it receives?

Google收到的每个请求都会打开多少个套接字？

This question doesn't actually appear in the interview, but it's in your title so I'll answer it. Google doesn't open any sockets at all. Your browser does that. Google accepts connections, in the form of new sockets, but I wouldn't describe that as 'opening' them.

这个问题实际上并没有出现在面试中，但它在你的标题中，所以我会回答它。谷歌根本不打开任何套接字。你的浏览器那样做了。 Google以新套接字的形式接受连接，但我不会将其描述为“打开”它们。

Interviewer : Tell me the process that happens behind the scene when I open a browser and type google.com in it.

采访者：告诉我当我打开浏览器并在其中输入google.com时，幕后发生的过程。

Me : The first thing that happens is a socket is created which is identified by {SRC-IP, SRC-PORT, DEST-IP, DEST-PORT, PROTOCOL}.

我：首先发生的是创建一个由{SRC-IP，SRC-PORT，DEST-IP，DEST-PORT，PROTOCOL}标识的套接字。

No. The connection is identified by the tuple. The socket is an endpoint to the connection.

不。连接由元组标识。套接字是连接的端点。

The SRC-PORT number is a random number given by browser.

SRC-PORT号码是浏览器给出的随机数。

No. By the operating system.

不。由操作系统。

Usually TCP/IP connection protocol (three way handshake is established). Now the both client (my browser) and server (google) are ready to handle requests. (TCP connection is established)

通常是TCP / IP连接协议（建立三次握手）。现在，客户端（我的浏览器）和服务器（谷歌）都准备好处理请求。（建立TCP连接）

Interviewer: wait, when does the name resolution happens?

采访者：等一下，名称解析何时发生？

Me: yep, I am sorry. It should have happened before even the socket is created. DNS name resolve happens first to get the IP address of google to reach at.

我：是的，对不起。它应该在创建套接字之前发生。 DNS名称解析首先发生以获取谷歌的IP地址。

Interviewer : Is a socket created for DNS name resolution?

访问者：是否为DNS名称解析创建了一个套接字？

Me : hmm, Iactually do not know. But all I know DNS name resolution is a connection-less. That is it not TCP but UDP. only a single request-response cycle happens. (so is a new socket created for DNS name resolution).

我：嗯，实际上不知道。但我所知道的DNS名称解析都是无连接的。那不是TCP而是UDP。只发生一个请求 - 响应周期。（这是为DNS名称解析创建的新套接字）。

Any rationally implemented browser would delegate the entire thing to the operating system's Sockets library, whose internal functioning depends on the OS. It might look at an in-memory cache, a file, a database, an LDAP server, several things, before going out to a DNS server, which it can do via either TCP or UDP. It's not a great question.

任何合理实现的浏览器都会将整个事务委托给操作系统的套接字库，其内部功能取决于操作系统。在进入DNS服务器之前，它可能会查看内存缓存，文件，数据库，LDAP服务器，以及它可以通过TCP或UDP执行的操作。这不是一个好问题。

Interviewer: google.com is open for other requests from other clients. so is establishing you connection with google is blocking other users?

采访者：google.com对其他客户的其他请求开放。所以建立你与谷歌的连接阻止其他用户？

Me: I am not sure how google handles this. But in a typical socket communication, it is blocking to a minimal extent.

我：我不确定谷歌如何处理这个问题。但是在典型的套接字通信中，它在最小程度上是阻塞的。

Wrong again. It has very little to do with Google specifically. A TCP server has a separate socket per accepted connection, and any rationally constructed TCP server handles them completely independently, either via multithreading, muliplexed/non-blocking I/O, or asynchronous I/O. They don't block each other.

又错了。它与谷歌没什么关系。 TCP服务器每个接受的连接都有一个单独的套接字，任何合理构造的TCP服务器都可以通过多线程，多路复用/非阻塞I / O或异步I / O完全独立地处理它们。它们不会相互阻挡。

Interviewer : How do you think this can be handled?

采访者：您认为如何处理？

Me : I guess the process forks a new thread and creates a socket to handle my request. From now on, my socket endpoint of communication with google is this this child socket.

我：我猜这个过程会分叉一个新线程并创建一个套接字来处理我的请求。从现在开始，我与谷歌通信的套接字端点就是这个子套接字。

Threads are created, not 'forked'. Forking a process creates another process, not another thread. The socket isn't 'created' so much as accepted, and this would normally precede thread creation. It isn't a 'child' socket.

创建线程，而不是“分叉”。分叉进程会创建另一个进程，而不是另一个进程。套接字不是被“创建”的，而是通常在创建线程之前。它不是'儿童'插座。

Interviewer: If that is the case, is this child socket’s port number different than the parent one.?

访问者：如果是这种情况，这个子套接字的端口号是否与父套接字号不同。

Me: The parent socket is listening at 80 for new requests from clients. The child must be listening at a different port number.

我：父套接字在80处侦听来自客户端的新请求。孩子必须在不同的端口号上听。

Wrong again. The accepted socket uses the same port number as the listening socket, and the accepted socket isn't 'listening' at all, it is receiving and sending.

又错了。接受的套接字使用与侦听套接字相同的端口号，并且接受的套接字根本不是“侦听”，它正在接收和发送。

Interviewer: how is your TCP connection maintained since your Dest-port number has changed. (That is the src-port number sent on google's packet) ?

采访者：由于您的Dest-port编号已更改，您的TCP连接如何维护。（那是google数据包发送的src-port号码）？

Me: The dest-port as what I see as client is always 80. when request is sent back, it also comes from port 80. I guess the OS/the parent process sets the src port back to 80 before it sends back the post.

我：我看作客户端的目标端口总是80.当请求被发送回来时，它也来自端口80.我想OS /父进程在发回邮件之前将src端口设置回80 。

This question was designed to explore your previous wrong answer. Your continuation of your wrong answer is still wrong.

此问题旨在探索您之前的错误答案。你继续错误的答案仍然是错误的。

Interviewer : how long is your socket connection established with google?

采访者：你的套接字连接与谷歌有多长时间了？

Me : If I don’t make any requests for a period of time, the main thread closes its child socket and any subsequent requests from me will be like am a new client.

我：如果我在一段时间内没有提出任何请求，主线程会关闭它的子套接字，来自我的任何后续请求就像是一个新客户端。

Wrong again. You don't know anything about threads at Google, let alone which thread closes the socket. Either end can close the connection at any time. Most probably the server end will beat you to it, but it isn't set in stone, and neither is which if any thread will do it.

又错了。你对谷歌的线程一无所知，更不用说哪个线程关闭了套接字。任何一端都可以随时关闭连接。很可能服务器端会击败你，但它并不是一成不变的，如果任何线程都没有这样做的话。

Interviewer : No, google will not keep a dedicated child socket for you. It handles your request and discards/recycles the sockets right away.

采访者：不，谷歌不会为你保留一个专用的子插座。它处理您的请求并立即丢弃/回收插座。

Here the interviewer is wrong. He doesn't seem to have heard of HTTP keep-alive, or the fact that it is the default in HTTP 1.1.

面试官在这里错了。他似乎没有听说过HTTP keep-alive，或者它是HTTP 1.1中的默认值。

Interviewer: Although google may have many servers to serve requests, each server can have only one parent socket opened at port 80. The number of clients to access google webpage must be exceeding larger than the number of servers they have. How is this usually handled?

采访者：虽然谷歌可能有很多服务器来处理请求，但每个服务器只能在端口80打开一个父插槽。访问谷歌网页的客户端数量必须超过他们拥有的服务器数量。这通常如何处理？

Me : I am not sure how this is handled. I see the only way it could work is spawn a thread for each request it receives.

我：我不确定如何处理。我看到它可以工作的唯一方法是为它收到的每个请求生成一个线程。

Here you haven't answered the question at all. He is fishing for an answer about load-balancers or round-robin DNS or something in front of all those servers. However his sentence "the number of clients to access google webpage must be exceeding larger than the number of servers they have" has already been answered by the existence of what you are both incorrectly calling 'child sockets'. Again, not a great question, unless you've reported it inaccurately.

在这里你根本没有回答这个问题。他正在寻找关于负载均衡器或循环DNS或所有这些服务器前面的东西的答案。然而，他的句子“访问谷歌网页的客户数量必须超过他们拥有的服务器数量”已经通过你们错误地称为“子套接字”的存在来回答。同样，这不是一个很好的问题，除非你报告的不准确。

Interviewer : Do you think the way Google handles is different from any bank website?

采访者：您认为Google处理的方式与任何银行网站不同吗？

Me: At the TCP-IP socket level, it should be similar. At the request level, slightly different because a session is maintained to keep state between requests in Banks websites.

我：在TCP-IP套接字级别，它应该是类似的。在请求级别，略有不同，因为维护会话以保持Banks网站中的请求之间的状态。

You almost got this one right. HTTP sessions to Google exist, as well as to bank websites. It isn't much of a question. He should be asking for facts, not your opinion.

你几乎得到了这个。存在与Google的HTTP会话以及银行网站。这不是一个问题。他应该询问事实，而不是你的意见。

Overall, (a) you failed the interview completely, and (b) you indulged in far too much guesswork. You should have simply stated 'I don't know' and let the interview proceed to things that you do know about.

总的来说，（a）你完全没有通过面试，而且（b）你沉迷于太多的猜测。你应该简单地说“我不知道”并让面试继续你所知道的事情。

#2

-1

For point #4, here is how I understand it: if both ends of an end to end connection were the same as that of another socket, there would indeed be no way to distinguish both socket, but if a single end is not the same as that of the other socket, then it's easy to distinguish both. So there is not need to turn destination port 80 (the default) forth and back, since the source ports differ.

对于第4点，这是我理解的方式：如果端到端连接的两端与另一个套接字的两端相同，则确实无法区分两个套接字，但如果单个端不相同作为另一个套接字的那个，那么很容易区分两者。因此，不需要向前和向后转动目标端口80（默认值），因为源端口不同。

推荐阅读

ip
Thrift教程初级篇——RPC框架Thrift的安装环境变量配置与第一个实例

本文介绍了RPC框架Thrift的安装环境变量配置与第一个实例，讲解了RPC的概念以及如何解决跨语言、c++客户端、web服务端、远程调用等需求。Thrift开发方便上手快，性能和稳定性也不错，适合初学者学习和使用。 ... [详细]

蜡笔小新 2023-12-13 17:36:52
ip
利用Visual Basic开发SAP接口程序初探的方法与原理

本文介绍了利用Visual Basic开发SAP接口程序的方法与原理，以及SAP R/3系统的特点和二次开发平台ABAP的使用。通过程序接口自动读取SAP R/3的数据表或视图，在外部进行处理和利用水晶报表等工具生成符合中国人习惯的报表样式。具体介绍了RFC调用的原理和模型，并强调本文主要不讨论SAP R/3函数的开发，而是针对使用SAP的公司的非ABAP开发人员提供了初步的接口程序开发指导。 ... [详细]

蜡笔小新 2023-12-13 10:56:31
ip
如何使用Java获取服务器硬件信息和磁盘负载率

本文介绍了使用Java编程语言获取服务器硬件信息和磁盘负载率的方法。首先在远程服务器上搭建一个支持服务端语言的HTTP服务，并获取服务器的磁盘信息，并将结果输出。然后在本地使用JS编写一个AJAX脚本，远程请求服务端的程序，得到结果并展示给用户。其中还介绍了如何提取硬盘序列号的方法。 ... [详细]

蜡笔小新 2023-12-14 13:56:20
get
页面请求方法参数最长_关于 HTTP GET/POST 请求参数长度最大值的一个理解误区

http:my.oschina.netleejun2005blog136820刚看到群里又有同学在说HTTP协议下的Get请求参数长度是有大小限制的，最大不能超过XX ... [详细]

蜡笔小新 2023-12-13 19:20:03
ip
计算机存储系统的层次结构及其优势

本文介绍了计算机存储系统的层次结构，包括高速缓存、主存储器和辅助存储器三个层次。通过分层存储数据可以提高程序的执行效率。计算机存储系统的层次结构将各种不同存储容量、存取速度和价格的存储器有机组合成整体，形成可寻址存储空间比主存储器空间大得多的存储整体。由于辅助存储器容量大、价格低，使得整体存储系统的平均价格降低。同时，高速缓存的存取速度可以和CPU的工作速度相匹配，进一步提高程序执行效率。 ... [详细]

蜡笔小新 2023-12-13 17:32:41
get
Web学习历程记录（七）——Tomcat基本概念和配置

本文介绍了Web学习历程记录中关于Tomcat的基本概念和配置。首先解释了Web静态Web资源和动态Web资源的概念，以及C/S架构和B/S架构的区别。然后介绍了常见的Web服务器，包括Weblogic、WebSphere和Tomcat。接着详细讲解了Tomcat的虚拟主机、web应用和虚拟路径映射的概念和配置过程。最后简要介绍了http协议的作用。本文内容详实，适合初学者了解Tomcat的基础知识。 ... [详细]

蜡笔小新 2023-12-13 17:08:24
ip
计算机网络初识及通信流程分析

本文介绍了计算机网络的定义和通信流程，包括客户端编译文件、二进制转换、三层路由设备等。同时，还介绍了计算机网络中常用的关键词，如MAC地址和IP地址。 ... [详细]

蜡笔小新 2023-12-13 16:50:29
ip
拥抱Android Design Support Library新变化（导航视图、悬浮ActionBar）

转载请注明明桑AndroidAndroid5.0Loollipop作为Android最重要的版本之一，为我们带来了全新的界面风格和设计语言。看起来很受欢迎࿰ ... [详细]

蜡笔小新 2023-12-13 16:11:00
jsp
ABAP开发发送邮件程序的配置和代码整理

本文介绍了通过ABAP开发往外网发邮件的需求，并提供了配置和代码整理的资料。其中包括了配置SAP邮件服务器的步骤和ABAP写发送邮件代码的过程。通过RZ10配置参数和icm/server_port_1的设定，可以实现向Sap User和外部邮件发送邮件的功能。希望对需要的开发人员有帮助。摘要长度：184字。 ... [详细]

蜡笔小新 2023-12-13 15:50:17
get
Golang如何使用Cookie跟踪位置

关键词：Golang, Cookie, 跟踪位置, net/http/cookiejar, package main, golang.org/x/net/publicsuffix, io/ioutil, log, net/http, net/http/cookiejar ... [详细]

蜡笔小新 2023-12-13 15:47:22
python
Python连接服务器失败：使用aiohttp模拟服务器出现错误问题及解决方法

本文介绍了在使用Python中的aiohttp模块模拟服务器时出现的连接失败问题，并提供了相应的解决方法。文章中详细说明了出错的代码以及相关的软件版本和环境信息，同时也提到了相关的警告信息和函数的替代方案。通过阅读本文，读者可以了解到如何解决Python连接服务器失败的问题，并对aiohttp模块有更深入的了解。 ... [详细]

蜡笔小新 2023-12-13 12:37:59
ip
Python瓦片图下载、合并、绘图、标记的代码示例

本文提供了Python瓦片图下载、合并、绘图、标记的代码示例，包括下载代码、多线程下载、图像处理等功能。通过参考geoserver，使用PIL、cv2、numpy、gdal、osr等库实现了瓦片图的下载、合并、绘图和标记功能。代码示例详细介绍了各个功能的实现方法，供读者参考使用。 ... [详细]

蜡笔小新 2023-12-13 12:14:55
io
WebSocket与Socket.io的理解

WebSocketprotocol是HTML5一种新的协议。它的最大特点就是，服务器可以主动向客户端推送信息，客户端也可以主动向服务器发送信息，是真正的双向平等对话，属于服务器推送 ... [详细]

蜡笔小新 2023-12-12 19:35:15
get
java命令运行

Java在运行已编译完成的类时，是通过java虚拟机来装载和执行的，java虚拟机通过操作系统命令JAVA_HOMEbinjava–option来启 ... [详细]

蜡笔小新 2023-12-12 19:26:55
ip
Linux下Kafka单机安装配置方法（实操成功）

本文介绍了在Linux下安装和配置Kafka的方法，包括安装JDK、下载和解压Kafka、配置Kafka的参数，以及配置Kafka的日志目录、服务器IP和日志存放路径等。同时还提供了单机配置部署的方法和zookeeper地址和端口的配置。通过实操成功的案例，帮助读者快速完成Kafka的安装和配置。 ... [详细]

蜡笔小新 2023-12-12 18:14:32

白羊蓝色雨线

这个家伙很懒，什么也没留下！

Tags | 热门标签

RankList | 热门文章