0%

原文

Calculate the number of ways to place nn rooks on n×nn\times n chessboard so that both following conditions are met:

  • each empty cell is under attack;
  • exactly kk pairs of rooks attack each other.

An empty cell is under attack if there is at least one rook in the same row or at least one rook in the same column. Two rooks attack each other if they share the same row or column, and there are no other rooks between them. For example, there are only two pairs of rooks that attack each other in the following picture:

One of the ways to place the rooks for n=3n=3 and k=2k=2

Two ways to place the rooks are considered different if there exists at least one cell which is empty in one of the ways but contains a rook in another way.

The answer might be large, so print it modulo 998244353.

Input

The only line of the input contains two integers nn and kk (1n2000001\le n\le 200000; 0kn(n1)20\le k\le\frac{n(n-1)}{2}).

Output

Print one integer — the number of ways to place the rooks, taken modulo 998244353.

Examples

input

1
3 2

output

1
6

input

1
3 3

output

1
0

input

1
4 0

output

1
24

input

1
1337 42

output

1
807905441

题意

rook可以对同一行和同一列攻击,提供两个数nnkk,求在n×nn\times n的棋盘上放置nn个rook使得:

  • 所有棋盘格都可以被攻击到
  • kk对rook可以互相供给

的放置方法总数。

思路

由于所有地方都需要被攻击到,那么就相当于一定是每个行都有一个rook或者每个列都有一个rook.本质上是一个排列组合问题,可以发现,行和列实际上是对称的,所以只需要考虑行列当中的一种。

假设每个行上都有一个rook,如果k0k\ne0,那么必然是nn个rook分布在nkn-k个列上。对应的排列方法一共有:

(nk)nCnknk1(nk1)n+Cnknk2(nk2)n=i=0nk(1)iCnki(nki)n(n-k)^n - C_{n-k}^{n-k-1}(n-k-1)^n+C_{n-k}^{n-k-2}(n-k-2)^n=\sum_{i=0}^{n-k}(-1)^i C_{n-k}^{i}(n-k-i)^n

从上面的分析也可以发现,如果knk\ge n,那么不存在符合的情况。再注意当k0k\ne0的时候需要,需要乘2,而k=0k=0的时候直接作为结果就可以了。

代码

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
#include <bits/stdc++.h>
using namespace std;

#define ll long long
#define MOD 998244353

#define add(x,y) ((x+y)%MOD)
#define mul(x,y) ((x*y)%MOD)
#define sub(x,y) ((MOD+x-y)%MOD)

ll fastpow(ll x, ll y){
ll temp = 1;
while(y){
if(y&1)
temp = mul(temp, x);
x = mul(x, x);
y>>=1;
}
return temp;
}

ll inv(ll x){
return fastpow(x,MOD-2);
}

ll frac[200100];

ll C(ll n, ll i){
return mul(frac[n], inv(mul(frac[i], frac[n-i])));
}

int main(){
ios::sync_with_stdio(0);
cin.tie(0);
cout.tie(0);

int n,k;
cin>>n>>k;

if(k>=n){
cout<<0<<endl;
return 0;
}

ll ans = 0;
frac[0] = 1;
for(int i=1;i<=n;++i){
frac[i] = mul(i, frac[i-1]);
}
for(int i=0;i<=n-k;++i){
ll temp = mul(C(n-k,i), fastpow(n-k-i,n));
if(i&1)
ans = sub(ans, temp);
else
ans = add(ans,temp);
}

ans = mul(ans, C(n, n-k));
if(k)
ans = mul(ans, 2);
cout<<ans<<endl;
}

Exercise 1

这个建立映射的函数实际上和kern/pmap.c当中的boot_alloc()存在类似的地方,都是利用一个静态的变量来保存当前分配空间的起始地方,然后不断的增长进行分配。由于base每次都会增长,所以每次都是映射新的页面。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
//
// Reserve size bytes in the MMIO region and map [pa,pa+size) at this
// location. Return the base of the reserved region. size does *not*
// have to be multiple of PGSIZE.
//
void *
mmio_map_region(physaddr_t pa, size_t size)
{
// Where to start the next region. Initially, this is the
// beginning of the MMIO region. Because this is static, its
// value will be preserved between calls to mmio_map_region
// (just like nextfree in boot_alloc).
static uintptr_t base = MMIOBASE;

uintptr_t result;

// Reserve size bytes of virtual memory starting at base and
// map physical pages [pa,pa+size) to virtual addresses
// [base,base+size). Since this is device memory and not
// regular DRAM, you'll have to tell the CPU that it isn't
// safe to cache access to this memory. Luckily, the page
// tables provide bits for this purpose; simply create the
// mapping with PTE_PCD|PTE_PWT (cache-disable and
// write-through) in addition to PTE_W. (If you're interested
// in more details on this, see section 10.5 of IA32 volume
// 3A.)
//
// Be sure to round size up to a multiple of PGSIZE and to
// handle if this reservation would overflow MMIOLIM (it's
// okay to simply panic if this happens).
//
// Hint: The staff solution uses boot_map_region.
//
// Your code here:
if(base + ROUNDUP(size, PGSIZE) >= MMIOLIM)
panic("mmio_map_region: out of memory\n");
boot_map_region(kern_pgdir, base, size, pa, PTE_PCD | PTE_PWT | PTE_W);
result = base;
base += ROUNDUP(size, PGSIZE);
return (void *)result;
}

Exercise 2

修改page_init()的内容如下,只是增加一个特殊处理:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
// case 1:
pages[0].pp_ref = 1;
pages[0].pp_link = NULL;
// case 2, 3, 4:
for (i = 1; i < npages; i++) {
if((IOPHYSMEM <= i * PGSIZE && i * PGSIZE < pa_free_start) || page2pa(pages + i) == MPENTRY_PADDR)
{
pages[i].pp_ref = 1;
pages[i].pp_link = NULL;
}
else
{
pages[i].pp_ref = 0;
pages[i].pp_link = page_free_list;
page_free_list = &pages[i];
}
}

只要在第6行添加一个判断就可以了。

之后执行make qemu就可以看到如下的check都已经成功了。

1
2
3
check_page_free_list() succeeded!
check_page_alloc() succeeded!
check_page() succeeded!

Question 1

MPBOOTPHYS的作用是将高地址转换为低地址,使得可以在实模式下进行访问。在boot.S当中,本来就已经被链接到了低地址,不需要进行转换,而mpentry.S的代码都位于KERNBASE的上方,所以需要手动的利用MPBOOTPHYS宏进行转换。

Exercise 3

这里直接利用一个循环进行映射,由于每个栈之间是存在一个Gap的,所以只有[kstacktop_i - KSTKSIZE, kstacktop_i)部分需要进行映射。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
// Modify mappings in kern_pgdir to support SMP
// - Map the per-CPU stacks in the region [KSTACKTOP-PTSIZE, KSTACKTOP)
//
static void
mem_init_mp(void)
{
// Map per-CPU stacks starting at KSTACKTOP, for up to 'NCPU' CPUs.
//
// For CPU i, use the physical memory that 'percpu_kstacks[i]' refers
// to as its kernel stack. CPU i's kernel stack grows down from virtual
// address kstacktop_i = KSTACKTOP - i * (KSTKSIZE + KSTKGAP), and is
// divided into two pieces, just like the single stack you set up in
// mem_init:
// * [kstacktop_i - KSTKSIZE, kstacktop_i)
// -- backed by physical memory
// * [kstacktop_i - (KSTKSIZE + KSTKGAP), kstacktop_i - KSTKSIZE)
// -- not backed; so if the kernel overflows its stack,
// it will fault rather than overwrite another CPU's stack.
// Known as a "guard page".
// Permissions: kernel RW, user NONE
//
// LAB 4: Your code here:
uint32_t i, kstacktop_i;
for(i=0, kstacktop_i=KSTACKTOP;i < NCPU; ++i, kstacktop_i -= KSTKSIZE + KSTKGAP)
boot_map_region(kern_pgdir, kstacktop_i - KSTKSIZE, KSTKSIZE, PADDR(percpu_kstacks[i]), PTE_W);
}

Exercise 4

这里实际上就是用thiscpu来替换原本的全局变量,使得本来是在lab3当中对于单CPU适用的情况可以适用于多个CPU的情形。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
// Initialize and load the per-CPU TSS and IDT
void
trap_init_percpu(void)
{
// The example code here sets up the Task State Segment (TSS) and
// the TSS descriptor for CPU 0. But it is incorrect if we are
// running on other CPUs because each CPU has its own kernel stack.
// Fix the code so that it works for all CPUs.
//
// Hints:
// - The macro "thiscpu" always refers to the current CPU's
// struct CpuInfo;
// - The ID of the current CPU is given by cpunum() or
// thiscpu->cpu_id;
// - Use "thiscpu->cpu_ts" as the TSS for the current CPU,
// rather than the global "ts" variable;
// - Use gdt[(GD_TSS0 >> 3) + i] for CPU i's TSS descriptor;
// - You mapped the per-CPU kernel stacks in mem_init_mp()
// - Initialize cpu_ts.ts_iomb to prevent unauthorized environments
// from doing IO (0 is not the correct value!)
//
// ltr sets a 'busy' flag in the TSS selector, so if you
// accidentally load the same TSS on more than one CPU, you'll
// get a triple fault. If you set up an individual CPU's TSS
// wrong, you may not get a fault until you try to return from
// user space on that CPU.
//
// LAB 4: Your code here:

// Setup a TSS so that we get the right stack
// when we trap to the kernel.
thiscpu->cpu_ts.ts_esp0 = KSTACKTOP - cpunum() * (KSTKSIZE + KSTKGAP);
thiscpu->cpu_ts.ts_ss0 = GD_KD;
thiscpu->cpu_ts.ts_iomb = sizeof(struct Taskstate);

// Initialize the TSS slot of the gdt.
gdt[(GD_TSS0 >> 3) + cpunum()] = SEG16(STS_T32A, (uint32_t) (&(thiscpu->cpu_ts)),
sizeof(struct Taskstate) - 1, 0);
gdt[(GD_TSS0 >> 3) + cpunum()].sd_s = 0;

// Load the TSS selector (like other segment selectors, the
// bottom three bits are special; we leave them 0)
ltr(GD_TSS0 + (cpunum() << 3));

// Load the IDT
lidt(&idt_pd);
}

Exercise 5

根据文档的描述在四个所需要插入大内核锁的地方进行lock_kernel()unlock_kernel()的操作。

  • In i386_init(), acquire the lock before the BSP wakes up the other CPUs.

    1
    2
    3
    4
    5
    6
    // Acquire the big kernel lock before waking up APs
    // Your code here:
    lock_kernel();

    // Starting non-boot CPUs
    boot_aps();
  • In mp_main(), acquire the lock after initializing the AP, and then call sched_yield() to start running environments on this AP.

    1
    2
    3
    4
    5
    6
    7
    // Now that we have finished some basic setup, call sched_yield()
    // to start running processes on this CPU. But make sure that
    // only one CPU can enter the scheduler at a time!
    //
    // Your code here:
    lock_kernel();
    sched_yield();
  • In trap(), acquire the lock when trapped from user mode. To determine whether a trap happened in user mode or in kernel mode, check the low bits of the tf_cs.

    1
    2
    3
    4
    5
    6
    // Trapped from user mode.
    // Acquire the big kernel lock before doing any
    // serious kernel work.
    // LAB 4: Your code here.
    lock_kernel();
    assert(curenv);
  • In env_run(), release the lock right before switching to user mode. Do not do that too early or too late, otherwise you will experience races or deadlocks.

    1
    2
    3
    lcr3(PADDR(e->env_pgdir));
    unlock_kernel();
    env_pop_tf(&(e->env_tf));

Question 2

从trapentry.S当中可以看到,在调用trap()之前(还没有获得大内核锁),这个时候就已经往内核栈中压入了寄存器信息,如果内核栈不分离的话,在这个时候切换就会造成错误。

Exercise 6

这里的调度方法实际上是一个非常暴力的轮询,如果找到了一个状态是ENV_RUNNABLE的进程那么就让他上CPU。如果找了一圈都没有找到合适的进程的话,就看起始进程,如果它本来就在当前CPU上运行的话,那么就继续运行,否则的话一个进程不能在两个CPU上同时运行,就调用sched_halt()

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
// Choose a user environment to run and run it.
void
sched_yield(void)
{
struct Env *idle;

// Implement simple round-robin scheduling.
//
// Search through 'envs' for an ENV_RUNNABLE environment in
// circular fashion starting just after the env this CPU was
// last running. Switch to the first such environment found.
//
// If no envs are runnable, but the environment previously
// running on this CPU is still ENV_RUNNING, it's okay to
// choose that environment.
//
// Never choose an environment that's currently running on
// another CPU (env_status == ENV_RUNNING). If there are
// no runnable environments, simply drop through to the code
// below to halt the cpu.

// LAB 4: Your code here.
int start_i, i;
if(!curenv)
start_i = 0;
else
start_i = curenv->env_id;
for(i = 0; i < NENV; ++i)
if(envs[(start_i + i)%NENV].env_status == ENV_RUNNABLE)
env_run(&envs[(start_i + i)%NENV]);
if(envs[start_i%NENV].env_status == ENV_RUNNING && envs[start_i%NENV].env_cpunum == cpunum())
env_run(&envs[start_i%NENV]);
// sched_halt never returns
sched_halt();
}

syscall()当中添加新的系统调用的分发:

1
2
3
case SYS_yield:
sys_yield();
return 0;

mp_main()当中调用sched_yield()

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
// Setup code for APs
void
mp_main(void)
{
// We are in high EIP now, safe to switch to kern_pgdir
lcr3(PADDR(kern_pgdir));
cprintf("SMP: CPU %d starting\n", cpunum());

lapic_init();
env_init_percpu();
trap_init_percpu();
xchg(&thiscpu->cpu_status, CPU_STARTED); // tell boot_aps() we're up

// Now that we have finished some basic setup, call sched_yield()
// to start running processes on this CPU. But make sure that
// only one CPU can enter the scheduler at a time!
//
// Your code here:
lock_kernel();
sched_yield();

// Remove this after you finish Exercise 6
for (;;);
}

Question 3

env_run()当中对应代码部分如下:

1
2
3
4
5
6
7
8
9
10
if(curenv != NULL && curenv->env_status == ENV_RUNNING)
curenv->env_status = ENV_RUNNABLE;
curenv = e;
e->env_status = ENV_RUNNING;
++(e->env_runs);
lcr3(PADDR(e->env_pgdir));

unlock_kernel();

env_pop_tf(&(e->env_tf));

lcr3()前后都能够正常对e进行解引用是因为,在设置env_pgdir的时候是以kern_pgdir作为模板进行修改的,e地址在两个地址空间中是映射到同一个物理地址的,所以这里进行解引用的操作不会有问题。

Question 4

保存寄存器信息的操作发生在kern/trapentry.S当中:

1
2
3
4
5
6
7
8
9
10
11
.global _alltraps
_alltraps:
pushl %ds
pushl %es
pushal
pushl $GD_KD
popl %ds
pushl $GD_KD
popl %es
pushl %esp
call trap

恢复寄存器的操作发生在kern/env.c的env_pop_tf()当中:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
void
env_pop_tf(struct Trapframe *tf)
{
// Record the CPU we are running on for user-space debugging
curenv->env_cpunum = cpunum();

asm volatile(
"\tmovl %0,%%esp\n"
"\tpopal\n"
"\tpopl %%es\n"
"\tpopl %%ds\n"
"\taddl $0x8,%%esp\n" /* skip tf_trapno and tf_errcode */
"\tiret\n"
: : "g" (tf) : "memory");
panic("iret failed"); /* mostly to placate the compiler */
}

Exercise 7

这里每一个系统调用的主要内容都不复杂,主要的是进行许多参数有效性的检查,只需要按照注释中的内容进行参数检查就可以。

sys_exofork()

传建一个子进程,在子进程中返回值为0,在父进程中返回的是子进程的id,先将子进程的状态设置成ENV_NOT_RUNNABLE之后再进行地址空间的复制之后可以会再设置成可运行的状态。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
// Allocate a new environment.
// Returns envid of new environment, or < 0 on error. Errors are:
// -E_NO_FREE_ENV if no free environment is available.
// -E_NO_MEM on memory exhaustion.
static envid_t
sys_exofork(void)
{
// Create the new environment with env_alloc(), from kern/env.c.
// It should be left as env_alloc created it, except that
// status is set to ENV_NOT_RUNNABLE, and the register set is copied
// from the current environment -- but tweaked so sys_exofork
// will appear to return 0.

// LAB 4: Your code here.
struct Env* e;
int ret;
if((ret = env_alloc(&e, curenv->env_id)))
return ret;
e->env_tf = curenv->env_tf;
e->env_tf.tf_regs.reg_eax = 0;
e->env_status = ENV_NOT_RUNNABLE;

return e->env_id;
}

sys_env_set_status()

就是23行处的设置env_status。但是这个系统调用只能在ENV_RUNNABLEENV_NOT_RUNNABLE当中设置。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
// Set envid's env_status to status, which must be ENV_RUNNABLE
// or ENV_NOT_RUNNABLE.
//
// Returns 0 on success, < 0 on error. Errors are:
// -E_BAD_ENV if environment envid doesn't currently exist,
// or the caller doesn't have permission to change envid.
// -E_INVAL if status is not a valid status for an environment.
static int
sys_env_set_status(envid_t envid, int status)
{
// Hint: Use the 'envid2env' function from kern/env.c to translate an
// envid to a struct Env.
// You should set envid2env's third argument to 1, which will
// check whether the current environment has permission to set
// envid's status.

// LAB 4: Your code here.
struct Env* env;
if(envid2env(envid, &env, 1))
return -E_BAD_ENV;
if(status != ENV_RUNNABLE && status != ENV_NOT_RUNNABLE)
return -E_INVAL;
env->env_status = status;
return 0;
}

sys_page_alloc()

envid的地址空间中分配一个页面,除去类型检查之外所做的内容就是page_alloc()page_insert()

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
// Allocate a page of memory and map it at 'va' with permission
// 'perm' in the address space of 'envid'.
// The page's contents are set to 0.
// If a page is already mapped at 'va', that page is unmapped as a
// side effect.
//
// perm -- PTE_U | PTE_P must be set, PTE_AVAIL | PTE_W may or may not be set,
// but no other bits may be set. See PTE_SYSCALL in inc/mmu.h.
//
// Return 0 on success, < 0 on error. Errors are:
// -E_BAD_ENV if environment envid doesn't currently exist,
// or the caller doesn't have permission to change envid.
// -E_INVAL if va >= UTOP, or va is not page-aligned.
// -E_INVAL if perm is inappropriate (see above).
// -E_NO_MEM if there's no memory to allocate the new page,
// or to allocate any necessary page tables.
static int
sys_page_alloc(envid_t envid, void *va, int perm)
{
// Hint: This function is a wrapper around page_alloc() and
// page_insert() from kern/pmap.c.
// Most of the new code you write should be to check the
// parameters for correctness.
// If page_insert() fails, remember to free the page you
// allocated!

// LAB 4: Your code here.
struct Env* env;
if(envid2env(envid, &env, 1))
return -E_BAD_ENV;
if((uint32_t)va >= UTOP || va != ROUNDDOWN(va, PGSIZE))
return -E_INVAL;
if((perm & (PTE_U | PTE_P)) != (PTE_U | PTE_P) || perm & (~PTE_SYSCALL))
return -E_INVAL;

struct PageInfo * pp;
if(!(pp = page_alloc(1)))
return -E_NO_MEM;
if(page_insert(env->env_pgdir, pp, va, perm))
{
page_free(pp);
return -E_NO_MEM;
}
return 0;
}

sys_page_map()

37行之前为参数的检查,39行之后为具体执行的内容,实际上完成的就是将srcenvid对应进程的地址空间中的srcva页面映射到dstenvid对应进程的地址空间中的dstva页面。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
// Map the page of memory at 'srcva' in srcenvid's address space
// at 'dstva' in dstenvid's address space with permission 'perm'.
// Perm has the same restrictions as in sys_page_alloc, except
// that it also must not grant write access to a read-only
// page.
//
// Return 0 on success, < 0 on error. Errors are:
// -E_BAD_ENV if srcenvid and/or dstenvid doesn't currently exist,
// or the caller doesn't have permission to change one of them.
// -E_INVAL if srcva >= UTOP or srcva is not page-aligned,
// or dstva >= UTOP or dstva is not page-aligned.
// -E_INVAL is srcva is not mapped in srcenvid's address space.
// -E_INVAL if perm is inappropriate (see sys_page_alloc).
// -E_INVAL if (perm & PTE_W), but srcva is read-only in srcenvid's
// address space.
// -E_NO_MEM if there's no memory to allocate any necessary page tables.
static int
sys_page_map(envid_t srcenvid, void *srcva,
envid_t dstenvid, void *dstva, int perm)
{
// Hint: This function is a wrapper around page_lookup() and
// page_insert() from kern/pmap.c.
// Again, most of the new code you write should be to check the
// parameters for correctness.
// Use the third argument to page_lookup() to
// check the current permissions on the page.

// LAB 4: Your code here.
struct Env *srcenv, *dstenv;
if(envid2env(srcenvid, &srcenv, 1) || envid2env(dstenvid, &dstenv, 1))
return -E_BAD_ENV;
if((uint32_t)srcva >= UTOP || srcva != ROUNDDOWN(srcva, PGSIZE))
return -E_INVAL;
if((uint32_t)dstva >= UTOP || dstva != ROUNDDOWN(dstva, PGSIZE))
return -E_INVAL;
if((perm & (PTE_U | PTE_P)) != (PTE_U | PTE_P) || perm & (~PTE_SYSCALL))
return -E_INVAL;

pte_t *pte;
struct PageInfo *pp;
if(!(pp = page_lookup(srcenv->env_pgdir, srcva, &pte)))
return -E_INVAL;
if((((*pte) & PTE_W) == 0) && (perm & PTE_W))
return -E_INVAL;
return page_insert(dstenv->env_pgdir, pp, dstva, perm);
}

sys_page_unmap()

实际上就是19行处的page_remove()操作,剩下的是参数的有效性检查。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
// Unmap the page of memory at 'va' in the address space of 'envid'.
// If no page is mapped, the function silently succeeds.
//
// Return 0 on success, < 0 on error. Errors are:
// -E_BAD_ENV if environment envid doesn't currently exist,
// or the caller doesn't have permission to change envid.
// -E_INVAL if va >= UTOP, or va is not page-aligned.
static int
sys_page_unmap(envid_t envid, void *va)
{
// Hint: This function is a wrapper around page_remove().

// LAB 4: Your code here.
struct Env* env;
if(envid2env(envid, &env, 1))
return -E_BAD_ENV;
if((uint32_t)va >= UTOP || va != ROUNDDOWN(va, PGSIZE))
return -E_INVAL;
page_remove(env->env_pgdir, va);
return 0;
}

最后要在syscall()当中添加分发的方法:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
case SYS_exofork:
return sys_exofork();

case SYS_env_set_status:
return sys_env_set_status((envid_t)a1, (int)a2);

case SYS_page_alloc:
return sys_page_alloc((envid_t)a1, (void *)a2, (int) a3);

case SYS_page_map:
return sys_page_map((envid_t)a1, (void *)a2, (envid_t)a3, (void *)a4, (int)a5);

case SYS_page_unmap:
return sys_page_unmap((envid_t)a1, (void *)a2);

Exercise 8

又是一个系统调用的设置,当使用envid2env()的时候需要进行权限的检查,如果能够正常的得到env的话就设置对应的env_pgfault_upcall。同样要在syscall()当中添加新的case。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
// Set the page fault upcall for 'envid' by modifying the corresponding struct
// Env's 'env_pgfault_upcall' field. When 'envid' causes a page fault, the
// kernel will push a fault record onto the exception stack, then branch to
// 'func'.
//
// Returns 0 on success, < 0 on error. Errors are:
// -E_BAD_ENV if environment envid doesn't currently exist,
// or the caller doesn't have permission to change envid.
static int
sys_env_set_pgfault_upcall(envid_t envid, void *func)
{
// LAB 4: Your code here.
struct Env* env;
if(envid2env(envid, &env, 1))
return -E_BAD_ENV;
env->env_pgfault_upcall = func;
return 0;
//panic("sys_env_set_pgfault_upcall not implemented");
}

Exercise 9

这里关于page_fault_handler()在有env_pgfault_upcall的情况下,分为两种情况,如果本身在Exception Stack里面的话,那么需要空出一个word的大小,具体的作用在后面Exercise 10会体现。否则的话直接将结构体压在Exception Stack的底部就可以了。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
// LAB 4: Your code here.
if(curenv->env_pgfault_upcall){
struct UTrapframe * utf;
if(ROUNDDOWN(tf->tf_esp, PGSIZE) == UXSTACKTOP - PGSIZE)
utf = (struct UTrapframe *)((tf->tf_esp) - sizeof(struct UTrapframe) - 4);
else
utf = (struct UTrapframe *)(UXSTACKTOP - sizeof(struct UTrapframe));
user_mem_assert(curenv, (void *)utf, sizeof(struct UTrapframe), PTE_W);
utf->utf_fault_va = fault_va;
utf->utf_err = tf->tf_err;
utf->utf_regs = tf->tf_regs;
utf->utf_eip = tf->tf_eip;
utf->utf_eflags = tf->tf_eflags;
utf->utf_esp = tf->tf_esp;
curenv->env_tf.tf_eip = (uintptr_t)curenv->env_pgfault_upcall;
curenv->env_tf.tf_esp = (uintptr_t)utf;
env_run(curenv);
}
else{
// Destroy the environment that caused the fault.
cprintf("[%08x] user fault va %08x ip %08x\n",
curenv->env_id, fault_va, tf->tf_eip);
print_trapframe(tf);
env_destroy(curenv);
}

第8行为写入的权限检查,之后9-14行为struct UTrapframe整个结构体的压入,然后修改curenv里面的内容,转入env_pgfault_upcall当中执行。

如果没有env_pgfault_upcall的话,那么就执行env_destroy()的操作。

Exercise 10

Exception Stack中的结构如下所示:

1
2
3
4
5
6
7
8
9
//	trap-time esp
// trap-time eflags
// trap-time eip
// utf_regs.reg_eax
// ...
// utf_regs.reg_esi
// utf_regs.reg_edi
// utf_err (error code)
// utf_fault_va <-- %esp

补全的_pgfault_upcall的代码如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
// LAB 4: Your code here.
movl 0x28(%esp), %edi
movl 0x30(%esp), %esi
subl $4, %esi
movl %edi, (%esi)
movl %esi, 0x30(%esp)

// Restore the trap-time registers. After you do this, you
// can no longer modify any general-purpose registers.
// LAB 4: Your code here.
addl $8, %esp
popal

// Restore eflags from the stack. After you do this, you can
// no longer use arithmetic operations or anything else that
// modifies eflags.
// LAB 4: Your code here.
addl $4, %esp
popfl

// Switch back to the adjusted trap-time stack.
// LAB 4: Your code here.
popl %esp

// Return to re-execute the instruction that faulted.
// LAB 4: Your code here.
ret

这里要实现栈切换同时需要保存%eip,首先在2、3行,将%eip取出放入%edi中,%esp取出放入%esi中,之后将%esp向下延伸一个word的大小,然后把%eip填入,之后将修改后的%esp放回保存的位置。

这样最终得到的%esp所指向的栈顶第一个元素就是我们之前所保存的%eip寄存器的值,就同时完成了栈的切换和%eip的恢复。后面就是不断退栈恢复寄存器的过程了,非常简单。

这里如果是在Exception Stack当中的重复调用,由于之前确保重复调用会在每两个结构之间留下一个word大小的gap,这个空隙就可以填入%eip保证以上的upcall在重复调用的情况下也能正常工作。

Exercise 11

如果是第一次进行调用的话,那么需要进行初始化的设置,即给Exception Stack分配空间(17行),同时设置pgfault_upcall(19行)。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
//
// Set the page fault handler function.
// If there isn't one yet, _pgfault_handler will be 0.
// The first time we register a handler, we need to
// allocate an exception stack (one page of memory with its top
// at UXSTACKTOP), and tell the kernel to call the assembly-language
// _pgfault_upcall routine when a page fault occurs.
//
void
set_pgfault_handler(void (*handler)(struct UTrapframe *utf))
{
int r;

if (_pgfault_handler == 0) {
// First time through!
// LAB 4: Your code here.
if(sys_page_alloc(0, (void *)(UXSTACKTOP - PGSIZE), PTE_U | PTE_P | PTE_W))
panic("set_pgfault_handler: page alloc fault!");
if(sys_env_set_pgfault_upcall(0, (void *)_pgfault_upcall))
panic("set_pgfault handler: set pgfault upcall failed!");
}
// Save handler pointer for assembly to call.
_pgfault_handler = handler;
}

Exercise 12

pgfault() 可以参照dumbfork.c里面的duppage(),事实上dumbfork就是全部都进行一个复制,而COW的fork()只有在写入写时复制页面的时候才会进行复制,所以这里首先进行一个检查,看是不是写入一个COW页面所产生的错误。如果是的话,就分配一个新的页面并且将整个页面的内容拷贝一份,这里如注释中所写明的利用三次系统调用实现。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
//
// Custom page fault handler - if faulting page is copy-on-write,
// map in our own private writable copy.
//
static void
pgfault(struct UTrapframe *utf)
{
void *addr = (void *) utf->utf_fault_va;
uint32_t err = utf->utf_err;
int r;

// Check that the faulting access was (1) a write, and (2) to a
// copy-on-write page. If not, panic.
// Hint:
// Use the read-only page table mappings at uvpt
// (see <inc/memlayout.h>).

// LAB 4: Your code here.
if(!((err & FEC_WR) && (uvpt[PGNUM(addr)] & PTE_COW)))
panic("pgfault: 0x%08x the fault page is not writable or copy-on-write page!", addr);

// Allocate a new page, map it at a temporary location (PFTEMP),
// copy the data from the old page to the new page, then move the new
// page to the old page's address.
// Hint:
// You should make three system calls.

// LAB 4: Your code here.
addr = ROUNDDOWN(addr, PGSIZE);
if((r = sys_page_alloc(0, PFTEMP, PTE_P|PTE_U|PTE_W)) < 0)
panic("pgfault: sys_page_alloc fail, %e", r);
memmove(PFTEMP, addr, PGSIZE);
if ((r = sys_page_map(0, PFTEMP, 0, addr, PTE_P|PTE_U|PTE_W)) < 0)
panic("pgfault: sys_page_map, %e", r);
if ((r = sys_page_unmap(0, PFTEMP)) < 0)
panic("pgfault: sys_page_unmap, %e", r);
}

这里duppage()的实现就是按照注释中的内容进行,首先判断原本的页面是不是writable或者COW的,如果是的话那么就将其perm设置成写时复制的。之后现在子进程的地址空间中进行映射,再在父进程的地址空间中进行映射。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
//
// Map our virtual page pn (address pn*PGSIZE) into the target envid
// at the same virtual address. If the page is writable or copy-on-write,
// the new mapping must be created copy-on-write, and then our mapping must be
// marked copy-on-write as well. (Exercise: Why do we need to mark ours
// copy-on-write again if it was already copy-on-write at the beginning of
// this function?)
//
// Returns: 0 on success, < 0 on error.
// It is also OK to panic on error.
//
static int
duppage(envid_t envid, unsigned pn)
{
int r;

// LAB 4: Your code here.
pte_t pte = uvpt[pn];
void * addr = (void *)(pn * PGSIZE);

uint32_t perm = pte & 0xFFF;
if(perm & (PTE_W | PTE_COW)){
perm &= ~PTE_W;
perm |= PTE_COW;
}
if((r = sys_page_map(0, addr, envid, addr, perm & PTE_SYSCALL))<0)
panic("duppage: %e", r);
if((r = sys_page_map(0, addr, 0, addr, perm & PTE_SYSCALL))<0)
panic("duppage: %e", r);
return 0;
}

fork()函数可以参照dumbfork的主体部分,由于只要赋值UTOP以下的地址空间,而Exception Stack是另外进行分配的,所以采用COW的复制方式到USTACKTOP就为止了。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
//
// User-level fork with copy-on-write.
// Set up our page fault handler appropriately.
// Create a child.
// Copy our address space and page fault handler setup to the child.
// Then mark the child as runnable and return.
//
// Returns: child's envid to the parent, 0 to the child, < 0 on error.
// It is also OK to panic on error.
//
// Hint:
// Use uvpd, uvpt, and duppage.
// Remember to fix "thisenv" in the child process.
// Neither user exception stack should ever be marked copy-on-write,
// so you must allocate a new page for the child's user exception stack.
//
envid_t
fork(void)
{
// LAB 4: Your code here.
int r;
envid_t envid;
uint8_t * addr;
set_pgfault_handler(pgfault);
envid = sys_exofork();
if(envid < 0)
panic("fork: sys_exofork failed!");
if(envid == 0){
thisenv = &envs[ENVX(sys_getenvid())];
return 0;
}

for(addr = (uint8_t *)UTEXT; addr <(uint8_t *)USTACKTOP; addr += PGSIZE)
if((uvpd[PDX(addr)] & PTE_P) && (uvpt[PGNUM(addr)] & PTE_P))
duppage(envid, PGNUM(addr));

if((r = sys_page_alloc(envid, (void *)(UXSTACKTOP-PGSIZE), PTE_W|PTE_P|PTE_U))<0)
panic("fork: sys_page_alloc failed, %e", r);

extern void _pgfault_upcall();
if((r = sys_env_set_pgfault_upcall(envid, _pgfault_upcall)))
panic("fork: sys_env_set_pgfault_upcall failed, %e", r);

if((r = sys_env_set_status(envid, ENV_RUNNABLE))<0)
panic("fork: sys_env_set_status failed, %e", r);

return envid;
}

Exercise 13

kern/trapentry.S和kern/trap.c当中由于我是用的是lab3里面challenge所描述的循环写法,这里并不需要做修改。

在kern/env.c的env_alloc()函数中设定EFLAG

1
2
3
// Enable interrupts while in user mode.
// LAB 4: Your code here.
e->env_tf.tf_eflags |= FL_IF;

sched_halt()当中所需要注意的就是取消掉sti的注释,设置IF位使得空闲CPU并不会屏蔽中断。

1
2
3
4
5
6
7
8
9
10
11
asm volatile (
"movl $0, %%ebp\n"
"movl %0, %%esp\n"
"pushl $0\n"
"pushl $0\n"
// Uncomment the following line after completing exercise 13
"sti\n"
"1:\n"
"hlt\n"
"jmp 1b\n"
: : "a" (thiscpu->cpu_ts.ts_esp0));

Exercise 14

只需要在trap_dispatch()当中添加分发的分支即可,这里需要按照注释内容在进行sched_yield()之前调用lapic_eoi()来确认中断。

1
2
3
4
5
6
7
8
// Handle clock interrupts. Don't forget to acknowledge the
// interrupt using lapic_eoi() before calling the scheduler!
// LAB 4: Your code here.
if(tf->tf_trapno == IRQ_OFFSET + IRQ_TIMER){
lapic_eoi();
sched_yield();
return;
}

Exercise 15

sys_ipc_recv()当中主要做的操作就是首先进行参数的检查,检查完了之后将其填入env当中,并且让出CPU等待发送消息的进程将其重新设置为RUNNABLE

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
// Block until a value is ready.  Record that you want to receive
// using the env_ipc_recving and env_ipc_dstva fields of struct Env,
// mark yourself not runnable, and then give up the CPU.
//
// If 'dstva' is < UTOP, then you are willing to receive a page of data.
// 'dstva' is the virtual address at which the sent page should be mapped.
//
// This function only returns on error, but the system call will eventually
// return 0 on success.
// Return < 0 on error. Errors are:
// -E_INVAL if dstva < UTOP but dstva is not page-aligned.
static int
sys_ipc_recv(void *dstva)
{
// LAB 4: Your code here.
struct Env * env;
if(envid2env(0, &env, 0))
return -E_BAD_ENV;
if((uint32_t)dstva < UTOP && (dstva != ROUNDDOWN(dstva, PGSIZE)))
return -E_INVAL;

env->env_ipc_dstva = dstva;
env->env_ipc_recving = true;
env->env_status = ENV_NOT_RUNNABLE;
sys_yield();

return 0;
}

sys_ipc_try_send()的操作主要是对于注释里面所提到的所有可能的错误情形进行检查,当srcva < UTOP的时候,和sys_page_map()当中的处理非常相似。在最终修改接收方env里面对应的值,并且将返回值设置成0。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
// Try to send 'value' to the target env 'envid'.
// If srcva < UTOP, then also send page currently mapped at 'srcva',
// so that receiver gets a duplicate mapping of the same page.
//
// The send fails with a return value of -E_IPC_NOT_RECV if the
// target is not blocked, waiting for an IPC.
//
// The send also can fail for the other reasons listed below.
//
// Otherwise, the send succeeds, and the target's ipc fields are
// updated as follows:
// env_ipc_recving is set to 0 to block future sends;
// env_ipc_from is set to the sending envid;
// env_ipc_value is set to the 'value' parameter;
// env_ipc_perm is set to 'perm' if a page was transferred, 0 otherwise.
// The target environment is marked runnable again, returning 0
// from the paused sys_ipc_recv system call. (Hint: does the
// sys_ipc_recv function ever actually return?)
//
// If the sender wants to send a page but the receiver isn't asking for one,
// then no page mapping is transferred, but no error occurs.
// The ipc only happens when no errors occur.
//
// Returns 0 on success, < 0 on error.
// Errors are:
// -E_BAD_ENV if environment envid doesn't currently exist.
// (No need to check permissions.)
// -E_IPC_NOT_RECV if envid is not currently blocked in sys_ipc_recv,
// or another environment managed to send first.
// -E_INVAL if srcva < UTOP but srcva is not page-aligned.
// -E_INVAL if srcva < UTOP and perm is inappropriate
// (see sys_page_alloc).
// -E_INVAL if srcva < UTOP but srcva is not mapped in the caller's
// address space.
// -E_INVAL if (perm & PTE_W), but srcva is read-only in the
// current environment's address space.
// -E_NO_MEM if there's not enough memory to map srcva in envid's
// address space.
static int
sys_ipc_try_send(envid_t envid, uint32_t value, void *srcva, unsigned perm)
{
// LAB 4: Your code here.
struct Env* dstenv, * srcenv;
if(envid2env(envid, &dstenv, 0) || envid2env(0, &srcenv, 0))
return -E_BAD_ENV;
if(!dstenv->env_ipc_recving)
return -E_IPC_NOT_RECV;

dstenv->env_ipc_perm = 0;

if((uint32_t)srcva < UTOP){
pte_t *pte;
struct PageInfo *pp;
if(srcva != ROUNDDOWN(srcva, PGSIZE))
return -E_INVAL;
if((perm & (PTE_U | PTE_P)) != (PTE_U | PTE_P) || perm & (~PTE_SYSCALL))
return -E_INVAL;
if(!(pp = page_lookup(srcenv->env_pgdir, srcva, &pte)))
return -E_INVAL;
if((((*pte) & PTE_W) == 0) && (perm & PTE_W))
return -E_INVAL;
if(page_insert(dstenv->env_pgdir, pp, dstenv->env_ipc_dstva, perm))
return -E_NO_MEM;
dstenv->env_ipc_perm = perm;
}

dstenv->env_ipc_recving = false;
dstenv->env_ipc_value = value;
dstenv->env_ipc_from = srcenv->env_id;
dstenv->env_status = ENV_RUNNABLE;
dstenv->env_tf.tf_regs.reg_eax = 0;

return 0;
}

在lib/ipc.c当中要提供用户态可用的进行send和recv操作的接口。两个函数的相同之处在于如果没有传递地址映射的话,那么要讲地址设置成一个UTOP上方的值。

这里ipc_recv()只要根据返回值r进行两种情况的区分即可:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
// Receive a value via IPC and return it.
// If 'pg' is nonnull, then any page sent by the sender will be mapped at
// that address.
// If 'from_env_store' is nonnull, then store the IPC sender's envid in
// *from_env_store.
// If 'perm_store' is nonnull, then store the IPC sender's page permission
// in *perm_store (this is nonzero iff a page was successfully
// transferred to 'pg').
// If the system call fails, then store 0 in *fromenv and *perm (if
// they're nonnull) and return the error.
// Otherwise, return the value sent by the sender
//
// Hint:
// Use 'thisenv' to discover the value and who sent it.
// If 'pg' is null, pass sys_ipc_recv a value that it will understand
// as meaning "no page". (Zero is not the right value, since that's
// a perfectly valid place to map a page.)
int32_t
ipc_recv(envid_t *from_env_store, void *pg, int *perm_store)
{
// LAB 4: Your code here.
int r;

r = sys_ipc_recv(pg ? pg : (void *)UTOP);
if(r){
if(from_env_store)
*from_env_store = 0;
if(perm_store)
*perm_store = 0;
return r;
}
else{
if(from_env_store)
*from_env_store = thisenv->env_ipc_from;
if(perm_store)
*perm_store = thisenv->env_ipc_perm;
return thisenv->env_ipc_value;
}
return 0;
}

而对于ipc_send()则是通过一个循环来不断地尝试发送信息,为了防止一直占用CPU,每次循环中都会调用sys_yield()主动让出。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
// Send 'val' (and 'pg' with 'perm', if 'pg' is nonnull) to 'toenv'.
// This function keeps trying until it succeeds.
// It should panic() on any error other than -E_IPC_NOT_RECV.
//
// Hint:
// Use sys_yield() to be CPU-friendly.
// If 'pg' is null, pass sys_ipc_try_send a value that it will understand
// as meaning "no page". (Zero is not the right value.)
void
ipc_send(envid_t to_env, uint32_t val, void *pg, int perm)
{
// LAB 4: Your code here.
int r;
do{
sys_yield();
r = sys_ipc_try_send(to_env, val, pg ? pg : (void *)UTOP, perm);
if(r != 0 && r != -E_IPC_NOT_RECV)
panic("ipc_send: faild, %e", r);
}while(r);
}

实现完了之后lab4的基础内容就已经结束了,执行make grade可以得到如下的输出:

1
2
3
4
5
6
7
8
9
10
11
spin: OK (1.8s) 
stresssched: OK (3.2s)
sendpage: OK (0.9s)
(Old jos.out.sendpage failure log removed)
pingpong: OK (1.9s)
(Old jos.out.pingpong failure log removed)
primes: OK (9.1s)
(Old jos.out.primes failure log removed)
Part C score: 25/25

Score: 80/80

看到三部分都可以拿到全部分数。

Challenge 6: sfork()

这一个challenge所要完成的是一个共享除了栈之外所有的地址空间的fork操作,记为sfork()

首先实现了一个sduppage()函数,所做的是将父进程的地址映射给复制到子进程上,对于权限并不做修改,可以看做只是在sys_page_map()的基础上的封装。

1
2
3
4
5
6
7
8
9
10
11
12
13
static int
sduppage(envid_t envid, unsigned pn)
{
int r;

pte_t pte = uvpt[pn];
void * addr = (void *)(pn * PGSIZE);

uint32_t perm = pte & 0xFFF;
if((r = sys_page_map(0, addr, envid, addr, perm & PTE_SYSCALL))<0)
panic("sduppage: %e", r);
return 0;
}

之后就是sfork()函数的实现,代码如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
int
sfork(void)
{
int r;
envid_t envid;
uint8_t * addr;
set_pgfault_handler(pgfault);
envid = sys_exofork();
if(envid < 0)
panic("fork: sys_exofork failed!");
if(envid == 0){
thisenv = &envs[ENVX(sys_getenvid())];
return 0;
}

bool in_stack = true;
for(addr = (uint8_t *)(USTACKTOP - PGSIZE); addr >= (uint8_t *)UTEXT; addr -= PGSIZE){
if((uvpd[PDX(addr)] & PTE_P) && (uvpt[PGNUM(addr)] & PTE_P)){
if(in_stack)
duppage(envid, PGNUM(addr));
else
sduppage(envid, PGNUM(addr));
}
else
in_stack = false;
}

if((r = sys_page_alloc(envid, (void *)(UXSTACKTOP-PGSIZE), PTE_W|PTE_P|PTE_U))<0)
panic("fork: sys_page_alloc failed, %e", r);

extern void _pgfault_upcall();
if((r = sys_env_set_pgfault_upcall(envid, _pgfault_upcall)))
panic("fork: sys_env_set_pgfault_upcall failed, %e", r);

if((r = sys_env_set_status(envid, ENV_RUNNABLE))<0)
panic("fork: sys_env_set_status failed, %e", r);

return envid;
}

可以看到与之前所实现的fork()函数的主体相对比,区别只存在16-26行。这里从栈底往下来进行duppage的操作,当超过栈顶之后,in_stack会被设置成false,之后就是共享的地址空间,全部调用sduppage()

到这里都是没有什么问题的,重点在于如何将thisenv这个全局变量能够使得不同的进程都能够得到其自身对应的Env结构体,否则的话,在sfork()的过程中,子进程会修改thisenv的指针。导致无论在父进程还是子进程当中,thisenv指向的都是子进程!

最为简单的想法就是通过sys_getenvid()系统调用得到envid,之后查找对应的Env结构体,由于两个进程共享地址空间,所以利用全局变量是不太方便的,一个简单方法是利用宏进行实现。

考虑第一种解决方案:

1
#define thisenv ((const volatile struct Env *)(&envs[ENVX(sys_getenvid())]))

这种写法可以完成需求,但是他是一个地址,而在libmain()当中以及fork()当中都有对于thisenv进行初始化的操作,这样需要进行额外的代码修改。

第二种解决方案:

1
2
extern const volatile struct Env *realenv;
#define thisenv (realenv = (const volatile struct Env *)(&envs[ENVX(sys_getenvid())])), realenv

利用逗号进行分隔,首先进行一个赋值操作,然后提供一个可以作为运算左值的对象,问题在于thisenv会被用作是cprintf()当中的参数,而逗号分隔会使得参数数量改变。

第三种解决方案:

1
2
extern const volatile struct Env *realenv;
#define thisenv ((const volatile struct Env *)*((realenv = (const volatile struct Env *)(&envs[ENVX(sys_getenvid())])), &realenv))

由于C中的逗号表达式以及赋值表达式所返回的都是值而不是对象,所以用先取地址再解引用的方式可以获得一个能作为运算左值的对象。这种方式理论上是没有问题的,但是由于当中会进行赋值操作,所以编译器会认为可能会导致结果出现偏差,会报warning。编译方式将warning视作error,所以这行不通。

最终采用的解决方案为利用一个新的指针数组存下所有Env结构体的地址,然后采用类似第一种解决方案的操作,不过得到的是一个可以作为赋值左值的对象。在inc/lib.c当中,添加关于penvs指针数组的声明,以及将thisenv作为一个宏进行声明。

1
2
3
4
extern const volatile struct Env *penvs[NENV];
extern const volatile struct Env envs[NENV];
extern const volatile struct PageInfo pages[];
# define thisenv penvs[ENVX(sys_getenvid())]

在lib/libmain.c当中声明penvs数组,并将其初始化。

1
2
3
4
5
6
7
8
9
10
const volatile struct Env * penvs[NENV];

//extern const volatile struct Env *thisenv;

void
libmain(int argc, char **argv)
{
int i;
for(i = 0; i < NENV; ++i)
penvs[i] = &envs[i];

在这样的操作下thisenv就可以完美兼容所有代码当中的情况了,不需要修改其他任何的实现。

执行pingpongs.c可以得到如下的输出:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
enabled interrupts: 1 2
[00000000] new env 00001000
[00001000] new env 00001001
i am 00001000; thisenv is 0xeec00000
send 0 from 1000 to 1001
1001 got 0 from 1000 (thisenv is 0xeec0007c 1001)
1000 got 1 from 1001 (thisenv is 0xeec00000 1000)
1001 got 2 from 1000 (thisenv is 0xeec0007c 1001)
1000 got 3 from 1001 (thisenv is 0xeec00000 1000)
1001 got 4 from 1000 (thisenv is 0xeec0007c 1001)
1000 got 5 from 1001 (thisenv is 0xeec00000 1000)
1001 got 6 from 1000 (thisenv is 0xeec0007c 1001)
1000 got 7 from 1001 (thisenv is 0xeec00000 1000)
1001 got 8 from 1000 (thisenv is 0xeec0007c 1001)
1000 got 9 from 1001 (thisenv is 0xeec00000 1000)
[00001000] exiting gracefully
[00001000] free env 00001000
1001 got 10 from 1000 (thisenv is 0xeec0007c 1001)
[00001001] exiting gracefully
[00001001] free env 00001001

可以发现实际上两个进程确实是共享了地址空间,并且thisenv能够正确的指向进程自身了。

如果将其中的sfork()修改成fork()的话,得到的输出如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
enabled interrupts: 1 2
[00000000] new env 00001000
[00001000] new env 00001001
i am 00001000; thisenv is 0xeec00000
send 0 from 1000 to 1001
1001 got 0 from 1000 (thisenv is 0xeec0007c 1001)
1000 got 0 from 1001 (thisenv is 0xeec00000 1000)
1001 got 1 from 1000 (thisenv is 0xeec0007c 1001)
1000 got 1 from 1001 (thisenv is 0xeec00000 1000)
1001 got 2 from 1000 (thisenv is 0xeec0007c 1001)
1000 got 2 from 1001 (thisenv is 0xeec00000 1000)
1001 got 3 from 1000 (thisenv is 0xeec0007c 1001)
1000 got 3 from 1001 (thisenv is 0xeec00000 1000)
1001 got 4 from 1000 (thisenv is 0xeec0007c 1001)
1000 got 4 from 1001 (thisenv is 0xeec00000 1000)
...

不同进程中的val值是不会共享的,综上测试可以说明sfork()的实现没有问题。

原文

Denis came to Nastya and discovered that she was not happy to see him… There is only one chance that she can become happy. Denis wants to buy all things that Nastya likes so she will certainly agree to talk to him.

The map of the city where they live has a lot of squares, some of which are connected by roads. There is exactly one way between each pair of squares which does not visit any vertex twice. It turns out that the graph of the city is a tree.

Denis is located at vertex 1 at the time 0. He wants to visit every vertex at least once and get back as soon as possible.

Denis can walk one road in 1 time. Unfortunately, the city is so large that it will take a very long time to visit all squares. Therefore, Denis took a desperate step. He pulled out his pocket time machine, which he constructed in his basement. With its help, Denis can change the time to any non-negative time, which is less than the current time.

But the time machine has one feature. If the hero finds himself in the same place and at the same time twice, there will be an explosion of universal proportions and Nastya will stay unhappy. Therefore, Denis asks you to find him a route using a time machine that he will get around all squares and will return to the first and at the same time the maximum time in which he visited any square will be minimal.

Formally, Denis’s route can be represented as a sequence of pairs:{v1,t1},{v2,t2},{v3,t3},,{vk,tk}\{v_1,t_1\},\{v_2,t_2\},\{v_3,t_3\},\ldots,\{v_k,t_k\}, where viv_i is number of square, and tit_i is time in which the boy is now.

The following conditions must be met:

  • The route starts on square 1 at time 0, i.e. v1=1,t1=0v_1=1,t_1=0 and ends on the square 1, i.e. vk=1v_k=1.
  • All transitions are divided into two types:
    1. Being in the square change the time: {vi,ti}{vi+1,ti+1}:vi+1=vi\{v_i,t_i\}\rightarrow\{v_{i+1},t_{i+1}\}:v_{i+1}=v_i, 0ti+1<ti\ 0 \le t_{i+1}<t_i.
    2. Walk along one of the roads: {vi,ti}{vi+1,ti+1}\{v_i,t_i\}\rightarrow\{v_{i+1},t_{i+1}\}. Herewith, viv_i and vi+1v_{i+1} are connected by road, and ti+1=ti+1t_{i+1}=t_i+1
  • All pairs {vi,ti}\{v_i,t_i\} must be different.
  • All squares are among v1,v2,,vkv_1,v_2,\ldots,v_k.

You need to find a route such that the maximum time in any square will be minimal, that is, the route for which max(t1,t2,,tk)\max(t_1,t_2,\ldots,t_k) will be the minimum possible.

Input

The first line contains a single integer nn (1n105)(1\le n\le 10^5) — the number of squares in the city.

The next n−1 lines contain two integers uu and vv (1v,un,uv)(1\le v,u\le n,u\ne v) - the numbers of the squares connected by the road.

It is guaranteed that the given graph is a tree.

Output

In the first line output the integer kk (1k106)(1\le k\le 10^6) — the length of the path of Denis.

In the next kk lines output pairs vi,tiv_i,t_i — pairs that describe Denis’s route (as in the statement).

All route requirements described in the statements must be met.

It is guaranteed that under given restrictions there is at least one route and an answer whose length does not exceed 10610^6. If there are several possible answers, print any.

Example

input

1
2
3
4
5
5
1 2
2 3
2 4
4 5

output

1
2
3
4
5
6
7
8
9
10
11
12
13
14
13
1 0
2 1
3 2
3 1
2 2
4 3
4 1
5 2
5 1
4 2
2 3
2 0
1 1

题意

一个树形的结构,每次可以往相邻节点移动,需要花费一单位时间,或者是修改时间为任意非负值,但是不能够两次在同一时间到达同一节点({idx,t}\{idx,t\}的组合必须是惟一的)。问从根节点出发,最后回到根节点所需要的最大时间的最小的时候的遍历方法。如果有多个方法只需要输出任何一种。

思路

可以知道一个事情,遍历整棵树的最大时间,至少为当中所有节点的最大度数:maxv=1ndegv=T\max_{v=1}^n \deg v = T

考虑到任何节点,需要通过该节点degv1\deg v-1次来遍历所有邻居然后还需要再通过一次就才能够返回该节点的祖先。那么只要能够构造一种遍历的方法,使得可以满足最大时间为TT,即满足了题意。考虑对任何一个子树,到达子树uu的根节点父节点的时候为tt,子树一共有k=degu1k=\deg u -1个子节点:

情况1: t+1Tkt+1 \le T-k,在中途不需要进行时间回溯

(v,t)(u,t+1)(w1,t+2)(v,t)\rightarrow(u,t+1)\rightarrow(w_1,t+2)\rightarrow(u,t+2)\rightarrow(u,t+2) \rightarrow(wk,t+k+1)\rightarrow(w_k,t+k+1)\rightarrow(u,t+k)\rightarrow(u,t+k) \rightarrow (u,t)(u,t) \rightarrow (v,t+1)(v,t+1)

情况2:需要进行时间回溯

(v,t)(u,t+1)(w1,t+2)(v,t)\rightarrow(u,t+1)\rightarrow(w_1,t+2)\rightarrow(u,t+2)\rightarrow(u,t+2)\rightarrow\ldots\rightarrow(u,T)(u,T) (u,t)\rightarrow(u,t^\prime) \rightarrow\ldots(wk,t+k+1)\rightarrow(w_k,t+k+1) \rightarrow\ldots(u,t+k)\rightarrow(u,t+k)(u,t)(v,t+1)\rightarrow(u,t)\rightarrow(v,t+1)

在代码中backt就是返回父节点的时间,idx为当前的节点下标,t为当前时间节点,fa记录的是父节点的下标。整体进行一遍dfs遍历,按照上述的逻辑进行时间的修改。

代码

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
#include <bits/stdc++.h>
using namespace std;

#define ll long long

int n;
vector<vector<int> > edges;
vector<pair<int,int> > ans;
int maxdeg = 0;

void dfs(int idx, int& t,int backt = -1, int fa = -1){
ans.push_back(make_pair(idx, t));
int cnt = edges[idx].size() - (fa == -1? 0: 1);
for(int to: edges[idx]){
if(to == fa)
continue;
if(t == maxdeg){ // go back
t = backt - cnt - 1;
ans.push_back(make_pair(idx, t));
}
t++;
dfs(to, t, t, idx);
ans.push_back(make_pair(idx, t));
--cnt;
}
if(fa == -1) // root
return;
if(t >= backt){
t = backt - 1;
ans.push_back(make_pair(idx, t));
}
++t;
}

int main(){
ios::sync_with_stdio(0);
cin.tie(0);
cout.tie(0);

cin>>n;
edges = vector<vector<int> >(n+1, vector<int>());
for(int i=1;i<n;++i){
int s, e;
cin>>s>>e;
edges[s].push_back(e);
edges[e].push_back(s);
}

for(int i=1;i<=n;++i)
maxdeg = max(maxdeg,(int)edges[i].size());

int t = 0;
dfs(1, t);
cout<< ans.size()<<endl;
for(auto temp: ans)
cout << temp.first <<' '<< temp.second <<endl;
}

原文

If the girl doesn’t go to Denis, then Denis will go to the girl. Using this rule, the young man left home, bought flowers and went to Nastya.

On the way from Denis’s house to the girl’s house is a road of nn lines. This road can’t be always crossed in one green light. Foreseeing this, the good mayor decided to place safety islands in some parts of the road. Each safety island is located after a line, as well as at the beginning and at the end of the road. Pedestrians can relax on them, gain strength and wait for a green light.

Denis came to the edge of the road exactly at the moment when the green light turned on. The boy knows that the traffic light first lights up gg seconds green, and then rr seconds red, then again gg seconds green and so on.

Formally, the road can be represented as a segment [0,n][0,n]. Initially, Denis is at point 00. His task is to get to point nn in the shortest possible time.

He knows many different integers d1,d2,,dmd_1,d_2,\ldots,d_m, where 0din0\le d_i\le n — are the coordinates of points, in which the safety islands are located. Only at one of these points, the boy can be at a time when the red light is on.

Unfortunately, Denis isn’t always able to control himself because of the excitement, so some restrictions are imposed:

  • He must always move while the green light is on because it’s difficult to stand when so beautiful girl is waiting for you. Denis can change his position by ±1\pm 1 in 11 second. While doing so, he must always stay inside the segment [0,n][0,n]
  • He can change his direction only on the safety islands (because it is safe). This means that if in the previous second the boy changed his position by +1+ 1 and he walked on a safety island, then he can change his position by ±1\pm 1. Otherwise, he can change his position only by +1+1. Similarly, if in the previous second he changed his position by 1-1, on a safety island he can change position by ±1\pm 1, and at any other point by 1- 1.
  • At the moment when the red light is on, the boy must be on one of the safety islands. He can continue moving in any direction when the green light is on.

Denis has crossed the road as soon as his coordinate becomes equal to nn.

This task was not so simple, because it’s possible that it is impossible to cross the road. Since Denis has all thoughts about his love, he couldn’t solve this problem and asked us to help him. Find the minimal possible time for which he can cross the road according to these rules, or find that it is impossible to do.

Input

The first line contains two integers nn and mm, (1n106,2mmin(n+1,104))(1\le n\le 10^6,2\le m\le \min(n+1,104)) — road width and the number of safety islands.

The second line contains mm distinct integers d1,d2,,dm (0din)d_1,d_2,\ldots,d_m\ (0\le d_i\le n) — the points where the safety islands are located. It is guaranteed that there are 00 and nn among them.

The third line contains two integers g,r (1g,r1000)g,r\ (1\le g,r\le 1000) — the time that the green light stays on and the time that the red light stays on.

Output

Output a single integer — the minimum time for which Denis can cross the road with obeying all the rules.

If it is impossible to cross the road output −1.

Examples

input

1
2
3
15 5
0 3 7 14 15
11 11

output

1
45

input

1
2
3
13 4
0 3 7 13
9 9

output

1
-1

Note

In the first test, the optimal route is:

  • for the first green light, go to 7 and return to 3. In this case, we will change the direction of movement at the point 7, which is allowed, since there is a safety island at this point. In the end, we will be at the point of 3, where there is also a safety island. The next 11 seconds we have to wait for the red light.
  • for the second green light reaches 14. Wait for the red light again.
  • for 11 second go to 15. As a result, Denis is at the end of the road.

In total, 45 seconds are obtained.

In the second test, it is impossible to cross the road according to all the rules.

大致意思

当绿灯亮起的时候可以不受限制的移动,但是绿灯结束的时候必须要停留在安全岛。只有在安全岛上可以进行方向的转换,问到达终点(第m个节点)所需要的最短时间是多少,如果不存在则输出-1。

思路

01-BFS,在同一个绿灯时间内的移动看做图上的边权重为0,利用push_front()否则的话认为边权重为1,利用push_back()。对于同一节点,距离绿灯开始时间相同的时刻,认为是图上相同节点。

总的时间复杂度为O(gm)O(g*m)

代码

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
#include <iostream>
#include <algorithm>
#include <cmath>
#include <map>
#include <queue>
#include <vector>
#include <stack>
#include <deque>
#include <set>
#include <cstring>
#include <cstdlib>
#include <cstdio>
using namespace std;

#define ll long long

int n, m;
int dist[10010];
int g, r;
int dp[10010][1010];

int main(){
ios::sync_with_stdio(0);
cin>>n>>m;
for(int i=0;i<m;++i)
cin>>dist[i];
cin>>g>>r;
sort(dist, dist+m);

deque<pair<int,int> > q;
q.push_back(make_pair(0,0));
dp[0][0] = 1;
int ans = 0x7fffffff;
while(!q.empty()){
pair<int,int> temp = q.front();
int idx = temp.first;
int val = temp.second;
q.pop_front();
for(int i=-1;i<=1;i+=2){
int new_idx = idx+i;
if(new_idx<0 || new_idx>=m)
continue;
else if(new_idx == m-1){
int d = abs(dist[idx]-dist[new_idx]);
if(val + d <= g){
ans = min(ans, (dp[idx][val]-1)*(g+r)+val+d);
}
}
else{
int d = abs(dist[idx]-dist[new_idx]);
if(val + d == g && dp[new_idx][0] == 0){
dp[new_idx][0] = dp[idx][val] + 1;
q.push_back(make_pair(new_idx, 0));
}
else if(val + d < g && dp[new_idx][val+d] == 0){
dp[new_idx][val+d] = dp[idx][val];
q.push_front(make_pair(new_idx, val+d));
}
}
}
}
cout<<((ans == 0x7fffffff)? -1: ans)<<endl;
}

Exercise 1

如同Exercise描述中所说的,类似pages分配的方法一样利用boot_alloc()就好,由于后面有env_init()函数负责结构体的初始化,这个地方不需要进行初始化的操作。

1
2
3
// Make 'envs' point to an array of size 'NENV' of 'struct Env'.
// LAB 3: Your code here.
envs = (struct Env *) boot_alloc(sizeof(struct Env) * NENV);

同时采用相似的方法将其map到UENVS处:

1
2
3
4
5
6
7
8
9
// Map the 'envs' array read-only by the user at linear address UENVS
// (ie. perm = PTE_U | PTE_P).
// Permissions:
// - the new image at UENVS -- kernel R, user R
// - envs itself -- kernel RW, user NONE
// LAB 3: Your code here.
n = ROUNDUP(NENV*sizeof(struct Env), PGSIZE);
for (i = 0; i < n; i+=PGSIZE)
page_insert(kern_pgdir, pa2page(PADDR(envs) + i), (void *)(UENVS + i), PTE_U | PTE_P);

Exericse 2

env_init()

这里的操作实际上就是进行一个envs的初始化,最开始所有的都是空闲状态,将其插入env_free_list就可以了:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
// Mark all environments in 'envs' as free, set their env_ids to 0,
// and insert them into the env_free_list.
// Make sure the environments are in the free list in the same order
// they are in the envs array (i.e., so that the first call to
// env_alloc() returns envs[0]).
//
void
env_init(void)
{
// Set up envs array
// LAB 3: Your code here.
memset(envs, 0, sizeof(struct Env) * NENV);

int i;
env_free_list = envs;
for(i = 1; i < NENV; ++i)
envs[i-1].env_link = envs + i;

// Per-CPU part of the initialization
env_init_percpu();
}

env_setup_vm()

这里可以知道,在UTOP下面应当是空白的,在UTOP上面都是相同的,所以首先对整个page进行清空,之后利用memcpy以kern_pgdir为模板,只需要进行page table的修改就可以了:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
//
// Initialize the kernel virtual memory layout for environment e.
// Allocate a page directory, set e->env_pgdir accordingly,
// and initialize the kernel portion of the new environment's address space.
// Do NOT (yet) map anything into the user portion
// of the environment's virtual address space.
//
// Returns 0 on success, < 0 on error. Errors include:
// -E_NO_MEM if page directory or table could not be allocated.
//
static int
env_setup_vm(struct Env *e)
{
int i;
struct PageInfo *p = NULL;

// Allocate a page for the page directory
if (!(p = page_alloc(ALLOC_ZERO)))
return -E_NO_MEM;

// Now, set e->env_pgdir and initialize the page directory.
//
// Hint:
// - The VA space of all envs is identical above UTOP
// (except at UVPT, which we've set below).
// See inc/memlayout.h for permissions and layout.
// Can you use kern_pgdir as a template? Hint: Yes.
// (Make sure you got the permissions right in Lab 2.)
// - The initial VA below UTOP is empty.
// - You do not need to make any more calls to page_alloc.
// - Note: In general, pp_ref is not maintained for
// physical pages mapped only above UTOP, but env_pgdir
// is an exception -- you need to increment env_pgdir's
// pp_ref for env_free to work correctly.
// - The functions in kern/pmap.h are handy.

// LAB 3: Your code here.
e->env_pgdir = page2kva(p);
++(p->pp_ref);
memset(e->env_pgdir, 0, PGSIZE);
memcpy(e->env_pgdir + PDX(UTOP), kern_pgdir + PDX(UTOP), PGSIZE - (PDX(UTOP)<<2));

// UVPT maps the env's own page table read-only.
// Permissions: kernel R, user R
e->env_pgdir[PDX(UVPT)] = PADDR(e->env_pgdir) | PTE_P | PTE_U;

return 0;
}

region_alloc()

这里可以仿照在pmap.c中多次实现的alloc操作,只不过这里的page是利用page_alloc()得到的:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
//
// Allocate len bytes of physical memory for environment env,
// and map it at virtual address va in the environment's address space.
// Does not zero or otherwise initialize the mapped pages in any way.
// Pages should be writable by user and kernel.
// Panic if any allocation attempt fails.
//
static void
region_alloc(struct Env *e, void *va, size_t len)
{
// LAB 3: Your code here.
// (But only if you need it for load_icode.)
//
// Hint: It is easier to use region_alloc if the caller can pass
// 'va' and 'len' values that are not page-aligned.
// You should round va down, and round (va + len) up.
// (Watch out for corner-cases!)
void *start_va = ROUNDDOWN(va, PGSIZE);
void *end_va = ROUNDUP(va + len, PGSIZE);
void *cur_va;
for(cur_va = start_va; cur_va < end_va; cur_va += PGSIZE)
{
struct PageInfo * pp = page_alloc(0);
if(!pp)
panic("region_alloc: Out of memory!\n");
page_insert(e->env_pgdir, pp, (void *)cur_va, PTE_U | PTE_W);
}
}

load_icode()

先看看bootmain()当中是怎么操作的:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
void
bootmain(void)
{
struct Proghdr *ph, *eph;

// read 1st page off disk
readseg((uint32_t) ELFHDR, SECTSIZE*8, 0);

// is this a valid ELF?
if (ELFHDR->e_magic != ELF_MAGIC)
goto bad;

// load each program segment (ignores ph flags)
ph = (struct Proghdr *) ((uint8_t *) ELFHDR + ELFHDR->e_phoff);
eph = ph + ELFHDR->e_phnum;
for (; ph < eph; ph++)
// p_pa is the load address of this segment (as well
// as the physical address)
readseg(ph->p_pa, ph->p_memsz, ph->p_offset);

// call the entry point from the ELF header
// note: does not return!
((void (*)(void)) (ELFHDR->e_entry))();

bad:
outw(0x8A00, 0x8A00);
outw(0x8A00, 0x8E00);
while (1)
/* do nothing */;
}

仿照第13-19行进行下面的code:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
//
// Set up the initial program binary, stack, and processor flags
// for a user process.
// This function is ONLY called during kernel initialization,
// before running the first user-mode environment.
//
// This function loads all loadable segments from the ELF binary image
// into the environment's user memory, starting at the appropriate
// virtual addresses indicated in the ELF program header.
// At the same time it clears to zero any portions of these segments
// that are marked in the program header as being mapped
// but not actually present in the ELF file - i.e., the program's bss section.
//
// All this is very similar to what our boot loader does, except the boot
// loader also needs to read the code from disk. Take a look at
// boot/main.c to get ideas.
//
// Finally, this function maps one page for the program's initial stack.
//
// load_icode panics if it encounters problems.
// - How might load_icode fail? What might be wrong with the given input?
//
static void
load_icode(struct Env *e, uint8_t *binary)
{
// Hints:
// Load each program segment into virtual memory
// at the address specified in the ELF segment header.
// You should only load segments with ph->p_type == ELF_PROG_LOAD.
// Each segment's virtual address can be found in ph->p_va
// and its size in memory can be found in ph->p_memsz.
// The ph->p_filesz bytes from the ELF binary, starting at
// 'binary + ph->p_offset', should be copied to virtual address
// ph->p_va. Any remaining memory bytes should be cleared to zero.
// (The ELF header should have ph->p_filesz <= ph->p_memsz.)
// Use functions from the previous lab to allocate and map pages.
//
// All page protection bits should be user read/write for now.
// ELF segments are not necessarily page-aligned, but you can
// assume for this function that no two segments will touch
// the same virtual page.
//
// You may find a function like region_alloc useful.
//
// Loading the segments is much simpler if you can move data
// directly into the virtual addresses stored in the ELF binary.
// So which page directory should be in force during
// this function?
//
// You must also do something with the program's entry point,
// to make sure that the environment starts executing there.
// What? (See env_run() and env_pop_tf() below.)

// LAB 3: Your code here.
struct Elf * ELFHDR = (struct Elf *)binary;
struct Proghdr * ph, * eph;
lcr3(PADDR(e->env_pgdir));
ph = (struct Proghdr *)(binary + ELFHDR->e_phoff);
eph = ph + ELFHDR->e_phnum;
for(; ph < eph; ++ph)
{
if(ph->p_type == ELF_PROG_LOAD)
{
region_alloc(e, (void *)ph->p_va, ph->p_memsz);
memset((void *)ph->p_va, 0, ph->p_memsz);
memcpy((void *)ph->p_va, binary + ph->p_offset, ph->p_filesz);
}
}
e->env_tf.tf_eip = ELFHDR->e_entry;

lcr3(PADDR(kern_pgdir));
// Now map one page for the program's initial stack
// at virtual address USTACKTOP - PGSIZE.

// LAB 3: Your code here.
region_alloc(e, (void *)(USTACKTOP - PGSIZE), PGSIZE);
}

env_create()

直接调用env_alloc()load_icode()就可以了,可以看做是在上面的一层封装:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
//
// Allocates a new env with env_alloc, loads the named elf
// binary into it with load_icode, and sets its env_type.
// This function is ONLY called during kernel initialization,
// before running the first user-mode environment.
// The new env's parent ID is set to 0.
//
void
env_create(uint8_t *binary, enum EnvType type)
{
// LAB 3: Your code here.
struct Env * e;
if(env_alloc(&e, 0))
panic("env_create: env alloc failed!\n");
load_icode(e, binary);
e->env_type = type;
}

env_run()

要完成的是进行一个上下文的切换,这里主要做的就是首先对于env需要进行状态的改变,之后需要进行地址空间的切换。同时利用已经存在的env_pop_tf()函数来进行寄存器的恢复,具体的代码如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
//
// Context switch from curenv to env e.
// Note: if this is the first call to env_run, curenv is NULL.
//
// This function does not return.
//
void
env_run(struct Env *e)
{
// Step 1: If this is a context switch (a new environment is running):
// 1. Set the current environment (if any) back to
// ENV_RUNNABLE if it is ENV_RUNNING (think about
// what other states it can be in),
// 2. Set 'curenv' to the new environment,
// 3. Set its status to ENV_RUNNING,
// 4. Update its 'env_runs' counter,
// 5. Use lcr3() to switch to its address space.
// Step 2: Use env_pop_tf() to restore the environment's
// registers and drop into user mode in the
// environment.

// Hint: This function loads the new environment's state from
// e->env_tf. Go back through the code you wrote above
// and make sure you have set the relevant parts of
// e->env_tf to sensible values.

// LAB 3: Your code here.

if(curenv != NULL && curenv->env_status == ENV_RUNNING)
curenv->env_status = ENV_RUNNABLE;
curenv = e;
e->env_status = ENV_RUNNING;
++(e->env_runs);
lcr3(PADDR(e->env_pgdir));

env_pop_tf(&(e->env_tf));

//panic("env_run not yet implemented");
}

gdb对hello进行断点调试

在obj/user/hello.asm里面可以看到

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
void
sys_cputs(const char *s, size_t len)
{
800a1c: 55 push %ebp
800a1d: 89 e5 mov %esp,%ebp
800a1f: 57 push %edi
800a20: 56 push %esi
800a21: 53 push %ebx
//
// The last clause tells the assembler that this can
// potentially change the condition codes and arbitrary
// memory locations.

asm volatile("int %1\n"
800a22: b8 00 00 00 00 mov $0x0,%eax
800a27: 8b 4d 0c mov 0xc(%ebp),%ecx
800a2a: 8b 55 08 mov 0x8(%ebp),%edx
800a2d: 89 c3 mov %eax,%ebx
800a2f: 89 c7 mov %eax,%edi
800a31: 89 c6 mov %eax,%esi
800a33: cd 30 int $0x30

对应的地址为0x800a33,利用gdb进行断点设置:

1
2
3
4
5
6
7
(gdb) b *0x800a33
Breakpoint 2 at 0x800a33
(gdb) c
Continuing.
=> 0x800a33: int $0x30

Breakpoint 2, 0x00800a33 in ?? ()

发现确实执行到了这一条指令,以上的实现应该是没有问题。

Exercise 3

内容为阅读Chapter 9,是关于Exceptions和Interrupts的内容。

Exercise 4

从inc/trap.h当中可以发现,TrapFrame有着如下的结构:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
struct Trapframe {
struct PushRegs tf_regs;
uint16_t tf_es;
uint16_t tf_padding1;
uint16_t tf_ds;
uint16_t tf_padding2;
uint32_t tf_trapno;
/* below here defined by x86 hardware */
uint32_t tf_err;
uintptr_t tf_eip;
uint16_t tf_cs;
uint16_t tf_padding3;
uint32_t tf_eflags;
/* below here only when crossing rings, such as from user to kernel */
uintptr_t tf_esp;
uint16_t tf_ss;
uint16_t tf_padding4;
} __attribute__((packed));

可以知道对于剩下的就是要保存%es和%ds,来使得最终结构为一个Trapframe,剩下的按照Exercise的描述操作就可以了,得到_alltraps的结构如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
/*
* Lab 3: Your code here for _alltraps
*/
.global _alltraps
_alltraps:
pushl %ds
pushl %es
pushal
pushl $GD_KD
popl %ds
pushl $GD_KD
popl %es
pushl %esp
call trap

利用已经存在的TRAPHANDLERTRAPHANDLER_NOEC宏可以来生成handler的入口,只需要区分有没有错误码就可以了:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
/*
* Lab 3: Your code here for generating entry points for the different traps.
*/

TRAPHANDLER_NOEC(handler_divide, T_DIVIDE)
TRAPHANDLER_NOEC(handler_debug, T_DEBUG)
TRAPHANDLER_NOEC(handler_nmi, T_NMI)
TRAPHANDLER_NOEC(handler_brkpt, T_BRKPT)
TRAPHANDLER_NOEC(handler_oflow, T_OFLOW)
TRAPHANDLER_NOEC(handler_bound, T_BOUND)
TRAPHANDLER_NOEC(handler_illop, T_ILLOP)
TRAPHANDLER_NOEC(handler_device, T_DEVICE)
TRAPHANDLER(handler_dblflt, T_DBLFLT)
TRAPHANDLER(handler_tss, T_TSS)
TRAPHANDLER(handler_segnp, T_SEGNP)
TRAPHANDLER(handler_stack, T_STACK)
TRAPHANDLER(handler_gpflt, T_GPFLT)
TRAPHANDLER(handler_pgflt, T_PGFLT)
TRAPHANDLER_NOEC(handler_fperr, T_FPERR)
TRAPHANDLER(handler_align, T_ALIGN)
TRAPHANDLER_NOEC(handler_mchk, T_MCHK)
TRAPHANDLER_NOEC(handler_simderr, T_SIMDERR)
TRAPHANDLER_NOEC(handler_syscall, T_SYSCALL)
TRAPHANDLER_NOEC(handler_default, T_DEFAULT)

通过查询80386手册的9.10可以看到如下关于error code的总结:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
Description                       Interrupt     Error Code
Number

Divide error 0 No
Debug exceptions 1 No
Breakpoint 3 No
Overflow 4 No
Bounds check 5 No
Invalid opcode 6 No
Coprocessor not available 7 No
System error 8 Yes (always 0)
Coprocessor Segment Overrun 9 No
Invalid TSS 10 Yes
Segment not present 11 Yes
Stack exception 12 Yes
General protection fault 13 Yes
Page fault 14 Yes
Coprocessor error 16 No
Two-byte SW interrupt 0-255 No

在inc/mmu.h当中可以看到有关SETGATE的描述:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
// Set up a normal interrupt/trap gate descriptor.
// - istrap: 1 for a trap (= exception) gate, 0 for an interrupt gate.
// see section 9.6.1.3 of the i386 reference: "The difference between
// an interrupt gate and a trap gate is in the effect on IF (the
// interrupt-enable flag). An interrupt that vectors through an
// interrupt gate resets IF, thereby preventing other interrupts from
// interfering with the current interrupt handler. A subsequent IRET
// instruction restores IF to the value in the EFLAGS image on the
// stack. An interrupt through a trap gate does not change IF."
// - sel: Code segment selector for interrupt/trap handler
// - off: Offset in code segment for interrupt/trap handler
// - dpl: Descriptor Privilege Level -
// the privilege level required for software to invoke
// this interrupt/trap gate explicitly using an int instruction.
#define SETGATE(gate, istrap, sel, off, dpl) \
{ \
(gate).gd_off_15_0 = (uint32_t) (off) & 0xffff; \
(gate).gd_sel = (sel); \
(gate).gd_args = 0; \
(gate).gd_rsv1 = 0; \
(gate).gd_type = (istrap) ? STS_TG32 : STS_IG32; \
(gate).gd_s = 0; \
(gate).gd_dpl = (dpl); \
(gate).gd_p = 1; \
(gate).gd_off_31_16 = (uint32_t) (off) >> 16; \
}

之后再trap_init()当中进行这样的填充,要注意到断点和系统调用的dpl需要设置为3(用户):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
// LAB 3: Your code here.
void handler_divide();
SETGATE(idt[T_DIVIDE], 0, GD_KT, handler_divide, 0);
void handler_debug();
SETGATE(idt[T_DEBUG], 0, GD_KT, handler_debug, 0);
void handler_nmi();
SETGATE(idt[T_NMI], 0, GD_KT, handler_nmi, 0);
void handler_brkpt();
SETGATE(idt[T_BRKPT], 0, GD_KT, handler_brkpt, 3);
void handler_oflow();
SETGATE(idt[T_OFLOW], 0, GD_KT, handler_oflow, 0);
void handler_bound();
SETGATE(idt[T_BOUND], 0, GD_KT, handler_bound, 0);
void handler_illop();
SETGATE(idt[T_ILLOP], 0, GD_KT, handler_illop, 0);
void handler_device();
SETGATE(idt[T_DEVICE], 0, GD_KT, handler_device, 0);
void handler_dblflt();
SETGATE(idt[T_DBLFLT], 0, GD_KT, handler_dblflt, 0);
void handler_tss();
SETGATE(idt[T_TSS], 0, GD_KT, handler_tss, 0);
void handler_segnp();
SETGATE(idt[T_SEGNP], 0, GD_KT, handler_segnp, 0);
void handler_stack();
SETGATE(idt[T_STACK], 0, GD_KT, handler_stack, 0);
void handler_gpflt();
SETGATE(idt[T_GPFLT], 0, GD_KT, handler_gpflt, 0);
void handler_pgflt();
SETGATE(idt[T_PGFLT], 0, GD_KT, handler_pgflt, 0);
void handler_fperr();
SETGATE(idt[T_FPERR], 0, GD_KT, handler_fperr, 0);
void handler_align();
SETGATE(idt[T_ALIGN], 0, GD_KT, handler_align, 0);
void handler_mchk();
SETGATE(idt[T_MCHK], 0, GD_KT, handler_mchk, 0);
void handler_simderr();
SETGATE(idt[T_SIMDERR], 0, GD_KT, handler_simderr, 0);
void handler_syscall();
SETGATE(idt[T_SYSCALL], 1, GD_KT, handler_syscall, 3);
void handler_default();
SETGATE(idt[T_DEFAULT], 0, GD_KT, handler_default, 0);

利用make grade可以得到下面的输出:

1
2
3
4
5
6
7
divzero: OK (1.8s)
(Old jos.out.divzero failure log removed)
softint: OK (1.7s)
(Old jos.out.softint failure log removed)
badsegment: OK (2.1s)
(Old jos.out.badsegment failure log removed)
Part A score: 30/30

说明这里Part A的实现没有问题。

Question

  1. 如果所有的exception/interrupt都通过同样一个handler,那么就没有办法知道是通过哪一个中断进来的,不能设置对应的中断号,后面不能进行分发。

  2. 除了系统调用门,其他的特权级都设置成0,这里int $14本来应当触发page fault,但是这个时候权限不对,所以会触发general protection fault。如果允许他能够触发page fault的话,那么者会造成安全隐患。

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    [00000000] new env 00001000
    Incoming TRAP frame at 0xefffffbc
    TRAP frame at 0xf0226000
    edi 0x00000000
    esi 0x00000000
    ebp 0xeebfdfd0
    oesp 0xefffffdc
    ebx 0x00000000
    edx 0x00000000
    ecx 0x00000000
    eax 0x00000000
    es 0x----0023
    ds 0x----0023
    trap 0x0000000d General Protection
    err 0x00000072
    eip 0x00800037
    cs 0x----001b
    flag 0x00000046
    esp 0xeebfdfd0
    ss 0x----0023
    [00001000] free env 00001000

    当允许触发page fault的时候,可以看到保存的内容如下

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    [00000000] new env 00001000
    Incoming TRAP frame at 0xefffffc0
    TRAP frame at 0xefffffc0
    edi 0x00000000
    esi 0x00000000
    ebp 0xeebfdfd0
    oesp 0xefffffe0
    ebx 0x00000000
    edx 0x00000000
    ecx 0x00000000
    eax 0x00000000
    es 0x----0023
    ds 0x----0023
    trap 0x0000000e Page Fault
    cr2 0x00000000
    err 0x00800039 [kernel, read, protection]
    eip 0x0000001b
    cs 0x----0046
    flag 0xeebfdfd0
    esp 0x00000023
    ss 0x----ff53
    [00001000] free env 00001000

    在这里我的想法是第15行往后全部进行了一位的上移,可以看到之后的err当中的内容实际上应该是eip,eip的内容实际上是cs,以此类推。应该是cr2或者err code没有进行压入导致的。

Exercise 5

只需要在trap_dispatch()当中添加一个分支即可:

1
2
3
4
5
if(tf->tf_trapno == T_PGFLT)
{
page_fault_handler(tf);
return;
}

Exercise 6

和Exercise 5相同,只需要在trap_dispatch()里面添加一个分支:

1
2
3
4
5
if(tf->tf_trapno == T_BRKPT)
{
monitor(tf);
return;
}

Question

  1. 在于前面Exercise 4中的设置:

    1
    SETGATE(idt[T_BRKPT], 0, GD_KT, handler_brkpt, 3);

    这里当最后的dpl设置为3的时候,会正确的触发为break point exception,当设置为0的时候,会触发为general protection fault。其原因在于,如果设置为0,会导致断点触发需要内核级的权限,因为权限不够从而触发GPF。

  2. 这个测试的目的主要是检查权限是否设置正确,需要正确的区分用户和内核,防止用户对于内核代码进行操作产生安全隐患。

Exercise 7

在kern/trap.c里面,同之前两个Exercise一样进行分发的设置:

1
2
3
4
5
6
7
if(tf->tf_trapno == T_SYSCALL)
{
tf->tf_regs.reg_eax = syscall(tf->tf_regs.reg_eax,
tf->tf_regs.reg_edx, tf->tf_regs.reg_ecx, tf->tf_regs.reg_ebx,
tf->tf_regs.reg_edi, tf->tf_regs.reg_esi);
return;
}

在kern/syscall.c当中,利用switch进行分发即可,注意不同系统调用的参数就可以了:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
// Dispatches to the correct kernel function, passing the arguments.
int32_t
syscall(uint32_t syscallno, uint32_t a1, uint32_t a2, uint32_t a3, uint32_t a4, uint32_t a5)
{
// Call the function corresponding to the 'syscallno' parameter.
// Return any appropriate return value.
// LAB 3: Your code here.

//panic("syscall not implemented");

switch (syscallno) {
case SYS_cputs:
sys_cputs((const char *)a1, (size_t)a2);
return 0;

case SYS_cgetc:
return sys_cgetc();

case SYS_getenvid:
return sys_getenvid();

case SYS_env_destroy:
return sys_env_destroy((envid_t)a1);

case NSYSCALLS:
return 0;

default:
return -E_INVAL;
}
}

Exercise 8

在lib/libmain.c当中,进行env_id的指定:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
void
libmain(int argc, char **argv)
{
// set thisenv to point at our Env structure in envs[].
// LAB 3: Your code here.
envid_t envid = sys_getenvid();
thisenv = &envs[ENVX(envid)];

// save the name of the program so that panic() can use it
if (argc > 0)
binaryname = argv[0];

// call user main routine
umain(argc, argv);

// exit gracefully
exit();
}

Exercise 9

在kern/trap.c当中的page_fault_handler()函数当中,利用tf_cs来判断是不是在kernel-mode,如果是直接触发一个panic:

1
2
3
4
// Handle kernel-mode page faults.
// LAB 3: Your code here.
if(((tf->tf_cs)&3) == 0)
panic("page fault: happen in kernel mode! %08x\n", tf->tf_cs);

在kern/pmap.c当中,采用一个for循环对虚拟地址区间进行权限的检查,具体内容遵循注释就可以。当检查没有问题的时候返回值为0,否则返回值为-E_FAULT。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
//
// Check that an environment is allowed to access the range of memory
// [va, va+len) with permissions 'perm | PTE_P'.
// Normally 'perm' will contain PTE_U at least, but this is not required.
// 'va' and 'len' need not be page-aligned; you must test every page that
// contains any of that range. You will test either 'len/PGSIZE',
// 'len/PGSIZE + 1', or 'len/PGSIZE + 2' pages.
//
// A user program can access a virtual address if (1) the address is below
// ULIM, and (2) the page table gives it permission. These are exactly
// the tests you should implement here.
//
// If there is an error, set the 'user_mem_check_addr' variable to the first
// erroneous virtual address.
//
// Returns 0 if the user program can access this range of addresses,
// and -E_FAULT otherwise.
//
int
user_mem_check(struct Env *env, const void *va, size_t len, int perm)
{
// LAB 3: Your code here.
int newperm = perm | PTE_P;
uint32_t cur_addr;
pte_t * pte;
for(cur_addr = (uint32_t)va; cur_addr < (uint32_t)(va + len); cur_addr = ROUNDDOWN((cur_addr+PGSIZE),PGSIZE))
{
if(cur_addr >= ULIM)
{
user_mem_check_addr = cur_addr;
return -E_FAULT;
}
pte = pgdir_walk(env->env_pgdir, (void *)cur_addr, 0);
if((!pte) || ((*pte) & newperm) != newperm){
user_mem_check_addr = cur_addr;
return -E_FAULT;
}
}
return 0;
}

需要注意的是要在kern/syscall.c当中需要填充上有关检查的部分!

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
// Print a string to the system console.
// The string is exactly 'len' characters long.
// Destroys the environment on memory errors.
static void
sys_cputs(const char *s, size_t len)
{
// Check that the user has permission to read memory [s, s+len).
// Destroy the environment if not.

// LAB 3: Your code here.
user_mem_assert(curenv, s, len, PTE_W);

// Print the string supplied by the user.
cprintf("%.*s", len, s);
}

如果这里没有进行user_mem_assert()的话,执行buggyhello会进入系统调用然后在内核态触发page fault。

之后为backtrace相关的内容,在kern/kdebug.c当中添加有关usd,stabs,stabstr的检查如下,这里注意user_mem_check()当正常的时候返回值为0:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
else {
// The user-application linker script, user/user.ld,
// puts information about the application's stabs (equivalent
// to __STAB_BEGIN__, __STAB_END__, __STABSTR_BEGIN__, and
// __STABSTR_END__) in a structure located at virtual address
// USTABDATA.
const struct UserStabData *usd = (const struct UserStabData *) USTABDATA;

// Make sure this memory is valid.
// Return -1 if it is not. Hint: Call user_mem_check.
// LAB 3: Your code here.
if(user_mem_check(curenv, (void *)usd, sizeof(struct UserStabData), PTE_U))
return -1;

stabs = usd->stabs;
stab_end = usd->stab_end;
stabstr = usd->stabstr;
stabstr_end = usd->stabstr_end;

// Make sure the STABS and string table memory is valid.
// LAB 3: Your code here.
if(user_mem_check(curenv, (void *)stabs, (uint32_t)stabs-(uint32_t)stab_end, PTE_U))
return -1;
if(user_mem_check(curenv, (void *)stabstr, (uint32_t)stabstr_end-(uint32_t)stabstr, PTE_U))
return -1;
}

在执行breakpoint之后,利用backtrace得到的结果如下:

1
2
3
4
5
6
7
8
9
10
11
12
K> backtrace
Stack backtrace:
ebp efffff10 eip f01010d6 args 00000001 efffff28 f0228000 00000000 f01e6a40
kern/monitor.c:448: monitor+260
ebp efffff80 eip f01048ca args f0228000 efffffbc 00000000 00000000 00000000
kern/trap.c:195: trap+180
ebp efffffb0 eip f0104a43 args efffffbc 00000000 00000000 eebfdfd0 efffffdc
kern/trapentry.S:85: <unknown>+0
ebp eebfdfd0 eip 0080007b args 00000000 00000000 00000000 00000000 00000000
lib/libmain.c:27: libmain+63
ebp eebfdff0 eip 00800031 args 00000000 00000000Incoming TRAP frame at 0xeffffec
kernel panic at kern/trap.c:270: page fault: happen in kernel mode! 00000008

可以看到最后为lib/libmain.c,并且最终在内核态发生了page fault。可以发现,这个地方args在输出到第三个参数的时候突然触发,那应该是从ebp向上读取args触发的page fault。

结合mom_backtrace()的实现如下,应该是在13行的语句处出现的错误:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
int
mon_backtrace(int argc, char **argv, struct Trapframe *tf)
{
cprintf("Stack backtrace:\n");
uint32_t* ebp = (uint32_t*)read_ebp();
struct Eipdebuginfo info;
while(ebp){
cprintf("ebp %08x ",ebp);
cprintf("eip %08x ",ebp[1]);
cprintf("args");
int i;
for(i=2;i<=6;++i)
cprintf(" %08x",ebp[i]);
cprintf("\n");

Exercise 10

运行evilhello可以看到如下的输出:

1
2
3
4
5
6
7
[00000000] new env 00001000
Incoming TRAP frame at 0xefffffbc
Incoming TRAP frame at 0xefffffbc
[00001000] user_mem_check assertion failure for va f010000c
[00001000] free env 00001000
Destroyed the only environment - nothing more to do!
Welcome to the JOS kernel monitor!

用户环境被销毁了,并且kernel没有panic,说明行为符合预期。

使用make grade命令可以得到如下结果:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
divzero: OK (1.4s)
softint: OK (1.4s)
badsegment: OK (2.0s)
Part A score: 30/30

faultread: OK (1.9s)
faultreadkernel: OK (1.6s)
faultwrite: OK (0.9s)
faultwritekernel: OK (1.6s)
breakpoint: OK (2.0s)
testbss: OK (2.1s)
hello: OK (1.8s)
buggyhello: OK (1.7s)
buggyhello2: OK (0.8s)
evilhello: OK (1.6s)
Part B score: 50/50

Score: 80/80

说明lab3的内容已经被完成了。

Challenge 1

参考了github上https://github.com/SimpCosm/6.828/tree/master/lab3的实现。

其中TRAPHANDLER和TRAPHANDLER_NOEC的主要区别就在于有没有压入error code,这里采用一个if语句来进行判断:

1
2
3
4
5
6
7
8
9
10
11
12
13
#define GENERALHANDLER(name, num)	\
.data; \
.long name; \
.text; \
.globl name; \
.type name, @function; \
.align 2; \
name: \
.if !(num == 8 || (num >= 10 && num <= 14) || num == 17 ); \
pushl $0; \
.endif; \
pushl $(num); \
jmp _alltraps

之后构建一个数组vectors用来保存相关函数,就可以采用脚本语言批量生成重复代码:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
.data
.globl vectors
vectors:
GENERALHANDLER(handler0, 0)
GENERALHANDLER(handler1, 1)
GENERALHANDLER(handler2, 2)
GENERALHANDLER(handler3, 3)
GENERALHANDLER(handler4, 4)
GENERALHANDLER(handler5, 5)
GENERALHANDLER(handler6, 6)
GENERALHANDLER(handler7, 7)
GENERALHANDLER(handler8, 8)
GENERALHANDLER(handler9, 9)
GENERALHANDLER(handler10, 10)
GENERALHANDLER(handler11, 11)
GENERALHANDLER(handler12, 12)
GENERALHANDLER(handler13, 13)
GENERALHANDLER(handler14, 14)
GENERALHANDLER(handler15, 15)
GENERALHANDLER(handler16, 16)
GENERALHANDLER(handler17, 17)
GENERALHANDLER(handler18, 18)
GENERALHANDLER(handler19, 19)
GENERALHANDLER(handler20, 20)
GENERALHANDLER(handler21, 21)
GENERALHANDLER(handler22, 22)
GENERALHANDLER(handler23, 23)
GENERALHANDLER(handler24, 24)
GENERALHANDLER(handler25, 25)
GENERALHANDLER(handler26, 26)
GENERALHANDLER(handler27, 27)
GENERALHANDLER(handler28, 28)
GENERALHANDLER(handler29, 29)
GENERALHANDLER(handler30, 30)
GENERALHANDLER(handler31, 31)
GENERALHANDLER(handler32, 32)
GENERALHANDLER(handler33, 33)
GENERALHANDLER(handler34, 34)
GENERALHANDLER(handler35, 35)
GENERALHANDLER(handler36, 36)
GENERALHANDLER(handler37, 37)
GENERALHANDLER(handler38, 38)
GENERALHANDLER(handler39, 39)
GENERALHANDLER(handler40, 40)
GENERALHANDLER(handler41, 41)
GENERALHANDLER(handler42, 42)
GENERALHANDLER(handler43, 43)
GENERALHANDLER(handler44, 44)
GENERALHANDLER(handler45, 45)
GENERALHANDLER(handler46, 46)
GENERALHANDLER(handler47, 47)
GENERALHANDLER(handler48, 48)
GENERALHANDLER(handler49, 49)
GENERALHANDLER(handler50, 50)
GENERALHANDLER(handler51, 51)
GENERALHANDLER(handler52, 52)
GENERALHANDLER(handler53, 53)

之后就可以在kern/trap.c中对trap_init()采用循环构建入口,节省大量代码,对于特殊的可以单独提出来进行构造:

1
2
3
4
5
int i;
for(i = 0; i < 54; ++i)
SETGATE(idt[i], 0, GD_KT, vectors[i], 0);
SETGATE(idt[T_BRKPT], 0, GD_KT, vectors[T_BRKPT], 3);
SETGATE(idt[T_SYSCALL], 1, GD_KT, vectors[T_SYSCALL], 3);

在完成以上的修改之后,通过make grade仍然可以得到80分结果,说明没有问题:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
divzero: OK (1.4s)
softint: OK (1.4s)
badsegment: OK (1.6s)
Part A score: 30/30

faultread: OK (0.9s)
faultreadkernel: OK (1.5s)
faultwrite: OK (2.0s)
faultwritekernel: OK (1.6s)
breakpoint: OK (0.9s)
(Old jos.out.breakpoint failure log removed)
testbss: OK (1.5s)
hello: OK (1.6s)
buggyhello: OK (1.0s)
buggyhello2: OK (1.4s)
evilhello: OK (1.6s)
Part B score: 50/50

Score: 80/80

Challenge 2

Intel手册中12.3.1.4节为关于单步调试的相关内容:

This debug condition occurs at the end of an instruction if the trap flag (TF) of the flags register held the value one at the beginning of that instruction. Note that the exception does not occur at the end of an instruction that sets TF. For example, if POPF is used to set TF, a single-step trap does not occur until after the instruction that follows POPF.

意思就是设置了TF之后,执行完下一条命令会触发一个DEBUG。于是可以照如下写continue和stepi指令:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
int
mon_continue(int argc, char **argv, struct Trapframe *tf)
{
if(!tf)
return 0;
tf->tf_eflags &= ~FL_TF;
return -1;
}

int
mon_stepi(int argc, char **argv, struct Trapframe *tf)
{
if(!tf)
return 0;
tf->tf_eflags |= FL_TF;
return -1;
}

这里需要注意第六行,如果在continue里面不进行eflags维护单纯返回的话,会导致在执行了stepi指令之后,continue指令无效的情况。

同时为了使得stepi触发DEBUG之后能够回到monitor,需要在trap_dispatch()当中添加关于T_DEBUG的分发:

1
2
3
4
5
if(tf->tf_trapno == T_BRKPT || tf->tf_trapno == T_DEBUG)
{
monitor(tf);
return;
}

此时利用continue可以在断点程序之后继续执行,效果如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
[00000000] new env 00001000
Incoming TRAP frame at 0xefffffbc
Incoming TRAP frame at 0xefffffbc
Welcome to the JOS kernel monitor!
Type 'help' for a list of commands.
TRAP frame at 0xf0228000
edi 0x00000000
esi 0x00000000
ebp 0xeebfdfd0
oesp 0xefffffdc
ebx 0x00000000
edx 0x00000000
ecx 0x00000000
eax 0xeec00000
es 0x----0023
ds 0x----0023
trap 0x00000003 Breakpoint
err 0x00000000
eip 0x00800038
cs 0x----001b
flag 0x00000046
esp 0xeebfdfd0
ss 0x----0023
K> continue
Incoming TRAP frame at 0xefffffbc
[00001000] exiting gracefully
[00001000] free env 00001000
Destroyed the only environment - nothing more to do!
Welcome to the JOS kernel monitor!
Type 'help' for a list of commands.
K>

Challenge 3

从lab中所给的链接可以找到rdmsrwrmsr的宏定义:

1
2
3
4
5
6
7
8
9
#define rdmsr(msr,val1,val2) \
__asm__ __volatile__("rdmsr" \
: "=a" (val1), "=d" (val2) \
: "c" (msr))

#define wrmsr(msr,val1,val2) \
__asm__ __volatile__("wrmsr" \
: /* no outputs */ \
: "c" (msr), "a" (val1), "d" (val2))

从IA32的手册当中可以找到在使用SYSENTER之前所需要设置的相关内容:

  • IA32_SYSENTER_CS (MSR address 174H) — The lower 16 bits of this MSR are the segment selector for the privilege level 0 code segment. This value is also used to determine the segment selector of the privilege level 0 stack segment (see the Operation section). This value cannot indicate a null selector.
  • IA32_SYSENTER_EIP (MSR address 176H) — The value of this MSR is loaded into RIP (thus, this value references the first instruction of the selected operating procedure or routine). In protected mode, only bits 31:0 are loaded.
  • IA32_SYSENTER_ESP (MSR address 175H) — The value of this MSR is loaded into RSP (thus, this value contains the stack pointer for the privilege level 0 stack). This value cannot represent a non-canonical address. In protected mode, only bits 31:0 are loaded.

添加一个sysenter_init()并且在trap_init()内进行调用来实现初始化:

1
2
3
4
5
6
7
8
void
sysenter_init(void)
{
wrmsr(0x174, GD_KT, 0);
wrmsr(0x176, syscall_fast, 0);
wrmsr(0x175, KSTACKTOP, 0);
return;
}

在lib/syscall.c里面需要仿照syscall写一个syscall_fast,与syscall不同的是,这里采用sysenter而不是int 0x30。同时需要将%esi保存为sysenter之后的位置,并且push和pop保存%ebp。这里参数同syscall相似,只是少了最后一个a5。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
static inline int32_t
syscall_fast(int num, int check, uint32_t a1, uint32_t a2, uint32_t a3, uint32_t a4)
{
int32_t ret;
asm volatile(
"leal .after_sysenter_label, %%esi\n"
"push %%ebp\n"
"movl %%esp, %%ebp\n"
"sysenter\n"
".after_sysenter_label: popl %%ebp\n"
: "=a" (ret)
: "a" (num),
"d" (a1),
"c" (a2),
"b" (a3),
"D" (a4)
:);

if(check && ret > 0)
panic("syscall %d returned %d (> 0)", num, ret);

return ret;
}

在kern/syscall.c中要写一个handler用来处理对应的系统调用,这个就是之前在init里面所填充的入口,流程为从保存的寄存器中取得参数,执行相应的内容,结束之后将返回值保存并利用sysexit返回(这里如果参数不都采用"=m"的约束会出现bug,原因未知):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
void
syscall_fast(void)
{
uint32_t syscallno, a1, a2, a3, a4, ret;
uint32_t eip, esp;

asm volatile(
"mov %%eax, %0\n"
"mov %%edx, %1\n"
"mov %%ecx, %2\n"
"mov %%ebx, %3\n"
"mov %%edi, %4\n"
"mov %%esi, %5\n"
"mov (%%ebp), %6\n"
:"=r" (syscallno),
"=m" (a1),
"=m" (a2),
"=m" (a3),
"=m" (a4),
"=r" (eip),
"=r" (esp)
);

switch (syscallno) {
case SYS_cputs:
sys_cputs((const char *)a1, (size_t)a2);
ret = 0;
break;

case SYS_cgetc:
ret = sys_cgetc();
break;

case SYS_getenvid:
ret = sys_getenvid();
break;

case SYS_env_destroy:
ret = sys_env_destroy((envid_t)a1);
break;

case NSYSCALLS:
ret = 0;
break;

default:
panic("syscall_fast: wrong syscallno\n");
}

asm volatile(
"sysexit\n"
:
: "a" (ret),
"d" (eip),
"c" (esp)
:);
}

在lib/syscall.c中修改sys_cputs()让其调用syscall_fast()进行测试(实际上就是丢掉最后一个传入的参数就可以了):

1
2
3
4
5
6
void
sys_cputs(const char *s, size_t len)
{
syscall_fast(SYS_cputs, 0, (uint32_t)s, len, 0, 0);
//syscall(SYS_cputs, 0, (uint32_t)s, len, 0, 0, 0);
}

执行hello能够得到的输出如下:

1
2
3
4
5
6
7
8
[00000000] new env 00001000
Incoming TRAP frame at 0xefffffbc
hello, world
i am environment 00001000
Incoming TRAP frame at 0xefffffbc
[00001000] exiting gracefully
[00001000] free env 00001000
Destroyed the only environment - nothing more to do!

对比原来的输出:

1
2
3
4
5
6
7
8
9
10
[00000000] new env 00001000
Incoming TRAP frame at 0xefffffbc
Incoming TRAP frame at 0xefffffbc
hello, world
Incoming TRAP frame at 0xefffffbc
i am environment 00001000
Incoming TRAP frame at 0xefffffbc
[00001000] exiting gracefully
[00001000] free env 00001000
Destroyed the only environment - nothing more to do!

可以发现由于在进行系统调用的时候没有采用int 0x30,所以这里在每次输出前并没有都进入trap()函数,使得少去了两行Incoming TRAP frame at ....的输出。

替换后使用make grade也能够得到满分80分,至少在这个lab中采用sysenter不会有问题。

Before start

以下是memlayout.h中对于虚拟地址空间布局的描述:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
/*
* Virtual memory map: Permissions
* kernel/user
*
* 4 Gig --------> +------------------------------+
* | | RW/--
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
* : . :
* : . :
* : . :
* |~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~| RW/--
* | | RW/--
* | Remapped Physical Memory | RW/--
* | | RW/--
* KERNBASE, ----> +------------------------------+ 0xf0000000 --+
* KSTACKTOP | CPU0's Kernel Stack | RW/-- KSTKSIZE |
* | - - - - - - - - - - - - - - -| |
* | Invalid Memory (*) | --/-- KSTKGAP |
* +------------------------------+ |
* | CPU1's Kernel Stack | RW/-- KSTKSIZE |
* | - - - - - - - - - - - - - - -| PTSIZE
* | Invalid Memory (*) | --/-- KSTKGAP |
* +------------------------------+ |
* : . : |
* : . : |
* MMIOLIM ------> +------------------------------+ 0xefc00000 --+
* | Memory-mapped I/O | RW/-- PTSIZE
* ULIM, MMIOBASE --> +------------------------------+ 0xef800000
* | Cur. Page Table (User R-) | R-/R- PTSIZE
* UVPT ----> +------------------------------+ 0xef400000
* | RO PAGES | R-/R- PTSIZE
* UPAGES ----> +------------------------------+ 0xef000000
* | RO ENVS | R-/R- PTSIZE
* UTOP,UENVS ------> +------------------------------+ 0xeec00000
* UXSTACKTOP -/ | User Exception Stack | RW/RW PGSIZE
* +------------------------------+ 0xeebff000
* | Empty Memory (*) | --/-- PGSIZE
* USTACKTOP ---> +------------------------------+ 0xeebfe000
* | Normal User Stack | RW/RW PGSIZE
* +------------------------------+ 0xeebfd000
* | |
* | |
* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
* . .
* . .
* . .
* |~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~|
* | Program Data & Heap |
* UTEXT --------> +------------------------------+ 0x00800000
* PFTEMP -------> | Empty Memory (*) | PTSIZE
* | |
* UTEMP --------> +------------------------------+ 0x00400000 --+
* | Empty Memory (*) | |
* | - - - - - - - - - - - - - - -| |
* | User STAB Data (optional) | PTSIZE
* USTABDATA ----> +------------------------------+ 0x00200000 |
* | Empty Memory (*) | |
* 0 ------------> +------------------------------+ --+
*
* (*) Note: The kernel ensures that "Invalid Memory" is *never* mapped.
* "Empty Memory" is normally unmapped, but user programs may map pages
* there if desired. JOS user programs map pages temporarily at UTEMP.
*/

总体的虚拟内存布局是如上的一个状态。

在inc/mmu.h文件的注释中可以看到对于线性地址的结构描述如下,是按二级页表的方式进行地址转换的。前十位是一级页表的索引,中间十位是二级页表索引,最后的12位表示的是4K页面内部的偏移量。

1
2
3
4
5
6
7
8
// A linear address 'la' has a three-part structure as follows:
//
// +--------10------+-------10-------+---------12----------+
// | Page Directory | Page Table | Offset within Page |
// | Index | Index | |
// +----------------+----------------+---------------------+
// \--- PDX(la) --/ \--- PTX(la) --/ \---- PGOFF(la) ----/
// \---------- PGNUM(la) ----------/

在inc/memlayout.h中可以看到PageInfo的结构如下:

1
2
3
4
5
6
7
8
9
10
11
12
struct PageInfo {
// Next page on the free list.
struct PageInfo *pp_link;

// pp_ref is the count of pointers (usually in page table entries)
// to this page, for pages allocated using page_alloc.
// Pages allocated at boot time using pmap.c's
// boot_alloc do not have valid reference count fields.

uint16_t pp_ref;
};

其中pp_link链接的是free list当中下一个空闲的页面,而pp_ref表示的是指向该页面的指针的个数,当清零的时候说明页面就没有被指向了。在全局是利用一个PageInfo的数组来存放物理页面状态:

1
extern struct PageInfo *pages;

可以发现页面是通过一个PageInfo类型进行描述,指针与pages地址的差值就是页面号,物理地址就直接是一个32-bit的整数,相互转换依照上方的三级结构进行:

1
2
3
4
5
6
7
8
9
10
11
12
13
static inline physaddr_t
page2pa(struct PageInfo *pp)
{
return (pp - pages) << PGSHIFT;
}

static inline struct PageInfo*
pa2page(physaddr_t pa)
{
if (PGNUM(pa) >= npages)
panic("pa2page called with invalid pa");
return &pages[PGNUM(pa)];
}

所以以上两个函数提供了一个在page和物理地址之间进行相互转换的方式。

Exercise 1

boot_alloc

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
// This simple physical memory allocator is used only while JOS is setting
// up its virtual memory system. page_alloc() is the real allocator.
//
// If n>0, allocates enough pages of contiguous physical memory to hold 'n'
// bytes. Doesn't initialize the memory. Returns a kernel virtual address.
//
// If n==0, returns the address of the next free page without allocating
// anything.
//
// If we're out of memory, boot_alloc should panic.
// This function may ONLY be used during initialization,
// before the page_free_list list has been set up.
static void *
boot_alloc(uint32_t n)
{
static char *nextfree; // virtual address of next byte of free memory
char *result;

// Initialize nextfree if this is the first time.
// 'end' is a magic symbol automatically generated by the linker,
// which points to the end of the kernel's bss segment:
// the first virtual address that the linker did *not* assign
// to any kernel code or global variables.
if (!nextfree) {
extern char end[];
nextfree = ROUNDUP((char *) end, PGSIZE);
}

// Allocate a chunk large enough to hold 'n' bytes, then update
// nextfree. Make sure nextfree is kept aligned
// to a multiple of PGSIZE.
//
// LAB 2: Your code here.
if(npages * PGSIZE < (uint32_t)(nextfree + n - KERNBASE)) // out of memory
panic("boot_alloc: We are out of memory.\n");
result = nextfree;
nextfree = ROUNDUP(nextfree + n, PGSIZE);

return result;
}

从34行开始为补充的代码部分,nextfree作为static变量只会初始化一次,用来表示往后第一个没有被分配的virtual address。

由于在注释中要求要对于对于out of memory的情况需要触发panic,所以这里在34行进行了一个分配内容是否超过物理内存限制的检查。如果一切正常的话就进行分配,采用已经定义好的ROUNDUP宏来进行页面对齐。

如果n为0的时候,37行代码不会产生任何改变,符合注释中所描述的代码逻辑。

mem_init

1
2
3
4
5
6
7
8
9
// Allocate an array of npages 'struct PageInfo's and store it in 'pages'.
// The kernel uses this array to keep track of physical pages: for
// each physical page, there is a corresponding struct PageInfo in this
// array. 'npages' is the number of physical pages in memory. Use memset
// to initialize all fields of each struct PageInfo to 0.
// Your code goes here:

pages = (struct PageInfo *) boot_alloc(sizeof(struct PageInfo) * npages);
memset(pages, 0, sizeof(struct PageInfo)*npages);

这里所需要进行的就是对于pages这样一个存储PageInfo的数组进行空间分配,并且初始化为0。首先先利用boot_alloc进行空间的分配,之后利用memset进行清零就可以了。

这里可以看到,pages实际上就是在页目录之后进行分配的一串物理地址。

page_init

在memlayout.h中可以看到

1
2
3
4
5
// At IOPHYSMEM (640K) there is a 384K hole for I/O.  From the kernel,
// IOPHYSMEM can be addressed at KERNBASE + IOPHYSMEM. The hole ends
// at physical address EXTPHYSMEM.
#define IOPHYSMEM 0x0A0000
#define EXTPHYSMEM 0x100000

IOPHYSMEM对应的是640K的位置,EXTPHYSMEM对应的是1M的位置。在lab1当中内核代码就是被加载到了1M的后面,之后再之前的mem_init()当中,我们又在上面进行了pages的分配,当前可以使用的free空间应当是从之前分配的内容后面开始。boot_alloc()返回的是一个kernel virtual address,需要将其转换得到对应的physical address

从pmap.h文件中可以看到从PA向KVA的转换如下:

1
2
3
4
5
6
7
8
9
10
11
/* This macro takes a physical address and returns the corresponding kernel
* virtual address. It panics if you pass an invalid physical address. */
#define KADDR(pa) _kaddr(__FILE__, __LINE__, pa)

static inline void*
_kaddr(const char *file, int line, physaddr_t pa)
{
if (PGNUM(pa) >= npages)
_panic(file, line, "KADDR called with invalid pa %08lx", pa);
return (void *)(pa + KERNBASE);
}

那么从KVA向PA的转换只需要进行一个逆操作。

最后的代码如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
//
// Initialize page structure and memory free list.
// After this is done, NEVER use boot_alloc again. ONLY use the page
// allocator functions below to allocate and deallocate physical
// memory via the page_free_list.
//
void
page_init(void)
{
// The example code here marks all physical pages as free.
// However this is not truly the case. What memory is free?
// 1) Mark physical page 0 as in use.
// This way we preserve the real-mode IDT and BIOS structures
// in case we ever need them. (Currently we don't, but...)
// 2) The rest of base memory, [PGSIZE, npages_basemem * PGSIZE)
// is free.
// 3) Then comes the IO hole [IOPHYSMEM, EXTPHYSMEM), which must
// never be allocated.
// 4) Then extended memory [EXTPHYSMEM, ...).
// Some of it is in use, some is free. Where is the kernel
// in physical memory? Which pages are already in use for
// page tables and other data structures?
//
// Change the code to reflect this.
// NB: DO NOT actually touch the physical memory corresponding to
// free pages!
size_t i;
uint32_t pa_free_start = (uint32_t)((char *)boot_alloc(0) - KERNBASE);
// case 1:
pages[0].pp_ref = 1;
pages[0].pp_link = NULL;
// case 2, 3, 4:
for (i = 1; i < npages; i++) {
if(IOPHYSMEM <= i * PGSIZE && i * PGSIZE < pa_free_start)
{
pages[i].pp_ref = 1;
pages[i].pp_link = NULL;
}
else
{
pages[i].pp_ref = 0;
pages[i].pp_link = page_free_list;
page_free_list = &pages[i];
}
}
}

pa_free_start表示,这个物理地址后面的空间在当前都是空闲的,所以3和4的一部分都需要被设置成已经分配的状态。剩下的内容都被设置成空闲页面,加入到page_free_list当中。

page_alloc

这里完成的是对于页面的分配,根据alloc_flags来判断是否对于页面进行初始化。如果进行分配的话,那么就将page_free_list的头一个页面取出进行分配即可,初始化利用page2kva得到对应的地址,然后进行初始化操作。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
//
// Allocates a physical page. If (alloc_flags & ALLOC_ZERO), fills the entire
// returned physical page with '\0' bytes. Does NOT increment the reference
// count of the page - the caller must do these if necessary (either explicitly
// or via page_insert).
//
// Be sure to set the pp_link field of the allocated page to NULL so
// page_free can check for double-free bugs.
//
// Returns NULL if out of free memory.
//
// Hint: use page2kva and memset
struct PageInfo *
page_alloc(int alloc_flags)
{
// Fill this function in
struct PageInfo* alloc_page = page_free_list;
if(alloc_page == NULL)
return NULL;
page_free_list = alloc_page->pp_link;
alloc_page->pp_link = NULL;
if(alloc_flags && ALLOC_ZERO)
memset(page2kva(alloc_page), 0, PGSIZE);
return alloc_page;
}

page_free

这里做的操作是释放页面,将页面插入page_free_list的头部就可以,但是首先需要检查是否pp_ref为0且pp_link为_NULL,前者不为0表示对于仍在使用的页面进行了释放的操作,后者不为NULL说明它本身就已经是被释放的页面,进行了double free的操作,都要触发panic。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
//
// Return a page to the free list.
// (This function should only be called when pp->pp_ref reaches 0.)
//
void
page_free(struct PageInfo *pp)
{
// Fill this function in
// Hint: You may want to panic if pp->pp_ref is nonzero or
// pp->pp_link is not NULL.
if(pp->pp_ref != 0 || pp->pp_link != NULL)
panic("page_free: pp_ref or pp_link is not zero!\n");
pp->pp_link = page_free_list;
page_free_list = pp;
return;
}

到这里可以得到如下的输出内容

1
2
check_page_free_list() succeeded!
check_page_alloc() succeeded!

说明关于page_free_list的维护和page_alloc的操作都没有问题。

Exercise 2

主要是关于Intel 80386手册的描述,其中第五章主要是对分段机制的描述,第六章是对分页机制的描述。由于在JOS当中将整个空间看做一个段,所以段偏移量就是线性地址,只需要明白关于分页机制以及对应的线性地址转换到物理地址的过程就好。

Exercise 3

主要是GDB和QEMU命令的熟悉。

Exercise 4

pgdir_walk

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
// Given 'pgdir', a pointer to a page directory, pgdir_walk returns
// a pointer to the page table entry (PTE) for linear address 'va'.
// This requires walking the two-level page table structure.
//
// The relevant page table page might not exist yet.
// If this is true, and create == false, then pgdir_walk returns NULL.
// Otherwise, pgdir_walk allocates a new page table page with page_alloc.
// - If the allocation fails, pgdir_walk returns NULL.
// - Otherwise, the new page's reference count is incremented,
// the page is cleared,
// and pgdir_walk returns a pointer into the new page table page.
//
// Hint 1: you can turn a PageInfo * into the physical address of the
// page it refers to with page2pa() from kern/pmap.h.
//
// Hint 2: the x86 MMU checks permission bits in both the page directory
// and the page table, so it's safe to leave permissions in the page
// directory more permissive than strictly necessary.
//
// Hint 3: look at inc/mmu.h for useful macros that manipulate page
// table and page directory entries.
//
pte_t *
pgdir_walk(pde_t *pgdir, const void *va, int create)
{
// Fill this function in
uint32_t pdx = PDX(va);
uint32_t ptx = PTX(va);
if(pgdir[pdx] == 0)
{
if(create)
{
struct PageInfo* newpte = page_alloc(1);
if(newpte == NULL)
return NULL;
++(newpte->pp_ref);
pgdir[pdx] = page2pa(newpte) | PTE_P | PTE_W | PTE_U;
}
else
return NULL;
}
physaddr_t pte = PTE_ADDR(pgdir[pdx]) | (ptx << 2);
return KADDR(pte);
}

pgdir_walk()函数做的内容实际上是通过虚拟地址va来进行一个地址翻译,找到所对应的pte。这里传入的三个参数,pgdir是一个指向页目录基址的指针,va是要进行翻译的虚拟地址,create是一个标志,如果非0说明对于对应的va不存在pte的话需要进行分配。

那么首先检查就是页目录中对应的页表到底存在不存在,如果存在的话,直接取出然后进行pte的计算。那么如果不存在的话,就需要page_alloc来分配一个物理页面用来存储页表,并且将该物理页面的引用添加,之后由于关于权限的确认在后面的pte项当中也会进行,所以这里关于页表就可以直接提供全部的权限,将其填入对应的页目录的项中。

那创建了页表之后,就如同之前一样进行进一步的地址转换。但是由于这里物理地址是不能直接进行解引用操作的,所以利用KADDR宏将得到的物理地址转换成remap过的虚拟地址,这样可以通过解引用来获得对应的物理地址也能对于所存储的内容进行修改。

boot_map_region

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
//
// Map [va, va+size) of virtual address space to physical [pa, pa+size)
// in the page table rooted at pgdir. Size is a multiple of PGSIZE, and
// va and pa are both page-aligned.
// Use permission bits perm|PTE_P for the entries.
//
// This function is only intended to set up the ``static'' mappings
// above UTOP. As such, it should *not* change the pp_ref field on the
// mapped pages.
//
// Hint: the TA solution uses pgdir_walk
static void
boot_map_region(pde_t *pgdir, uintptr_t va, size_t size, physaddr_t pa, int perm)
{
// Fill this function in
while(size > 0)
{
pte_t* pte = pgdir_walk(pgdir, (void *)va, 1);
if(pte == NULL)
panic("boot_map_region: Fail to alloc new page, run out of memory!\n");
*pte = pa | perm | PTE_P;
size -= PGSIZE;
va += PGSIZE, pa += PGSIZE;
}
}

这个函数的作用是将一串连续的虚拟地址映射到一串连续的物理地址,其中映射的地址的大小是页面大小的整数倍。那么可以知道,直接的想法就是通过虚拟地址进行地址查询,然后将页表中对应的表项修改为映射到的物理地址就可以了。那么以每个页面单位来进行这样的操作。

首先通过pgdir_walk()来找到虚拟地址对应的表项,如果对应的二级页表不存在那么就进行空间的分配,如果分配失败则进行报错,出发一个panic。

之后就将物理地址以及对应的权限填到表项里面,然后对下一个需要映射的页进行相同的操作。

page_lookup

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
//
// Return the page mapped at virtual address 'va'.
// If pte_store is not zero, then we store in it the address
// of the pte for this page. This is used by page_remove and
// can be used to verify page permissions for syscall arguments,
// but should not be used by most callers.
//
// Return NULL if there is no page mapped at va.
//
// Hint: the TA solution uses pgdir_walk and pa2page.
//
struct PageInfo *
page_lookup(pde_t *pgdir, void *va, pte_t **pte_store)
{
// Fill this function in
pte_t* pte = pgdir_walk(pgdir, va, 0);
if(pte == NULL)
return NULL;
physaddr_t pa = PTE_ADDR(*pte);
if(pte_store)
*pte_store = pte;
return pa2page(pa);
}

这个地方的page_lookup()想要做的是通过虚拟地址va来查找对应的映射页的PageInfo结构,这边的操作就是首先去找pte,如果找到说明该虚拟地址被映射到了一个页面,得到映射页面的物理页面首地址,再通过pa2page()完成转换。那么如果没有找到的话,说明这个虚拟地址并没有映射到任何物理页面。如果传入的pte_store非空的话,就将其进行保存。

page_remove

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
//
// Unmaps the physical page at virtual address 'va'.
// If there is no physical page at that address, silently does nothing.
//
// Details:
// - The ref count on the physical page should decrement.
// - The physical page should be freed if the refcount reaches 0.
// - The pg table entry corresponding to 'va' should be set to 0.
// (if such a PTE exists)
// - The TLB must be invalidated if you remove an entry from
// the page table.
//
// Hint: The TA solution is implemented using page_lookup,
// tlb_invalidate, and page_decref.
//
void
page_remove(pde_t *pgdir, void *va)
{
// Fill this function in
pte_t* pte_store;
struct PageInfo* pp = page_lookup(pgdir, va, &pte_store);
if(pp == NULL)
return;
*pte_store = 0;
page_decref(pp);
tlb_invalidate(pgdir, va);
}

page_remove()所做的操作是将va映射到的物理页面给取消映射。要完成remove的操作需要做两件事情,一个就是将页表项中的对应内容给修改,另外一个就是对于PageInfo结构的修改,需要将其引用数减少,如果引用数为0,那么就将其加入空闲链表。在25行处的page_decref()函数做的实际上就是上述这个减少引用的操作。

那么一开始利用page_lookup()来找到对应的页面和pte,在24行修改pte,在25行修改链表结构,在26行调用tlb_invalidate()函数把TLB里面的内容给标注为无效。

page_insert

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
//
// Map the physical page 'pp' at virtual address 'va'.
// The permissions (the low 12 bits) of the page table entry
// should be set to 'perm|PTE_P'.
//
// Requirements
// - If there is already a page mapped at 'va', it should be page_remove()d.
// - If necessary, on demand, a page table should be allocated and inserted
// into 'pgdir'.
// - pp->pp_ref should be incremented if the insertion succeeds.
// - The TLB must be invalidated if a page was formerly present at 'va'.
//
// Corner-case hint: Make sure to consider what happens when the same
// pp is re-inserted at the same virtual address in the same pgdir.
// However, try not to distinguish this case in your code, as this
// frequently leads to subtle bugs; there's an elegant way to handle
// everything in one code path.
//
// RETURNS:
// 0 on success
// -E_NO_MEM, if page table couldn't be allocated
//
// Hint: The TA solution is implemented using pgdir_walk, page_remove,
// and page2pa.
//
int
page_insert(pde_t *pgdir, struct PageInfo *pp, void *va, int perm)
{
// Fill this function in
pte_t* pte = pgdir_walk(pgdir, va, 1);
if(pte == NULL)
return -E_NO_MEM;
physaddr_t pa = page2pa(pp);
++(pp->pp_ref);
if(*pte)
page_remove(pgdir, va);
*pte = pa | perm | PTE_P;
return 0;
}

page_insert()的操作就是将va映射到pp所指向的物理页面上去,而对应的权限通过perm来进行表示。那么利用pgdir_walk()函数来获得pte, 如果没有就进行创建。那么这个情况下如果返回为NULL,只有可能是空间不足无法创建,于是返回-E_NO_MEM。那么如果能够得到pte,就对于对应的物理页面进行处理,添加引用数,然后把本来pte可能存在的映射关系给消除,之后再进行映射。

这里存在的一个问题是,如果我这里提供的pp本来就是va映射的对象,可能会出现问题。考虑将34行进行引用数增加的内容移到36行后面,那么他首先进行了page_remove。如果之前的引用数为1,那这个页面将被加入空闲链表。而之后再给他加了一个引用数,这就相当于空闲链表中存在着不空闲的页面,他可能会被二次分配。存在一个bug。而将该语句保存在34行的位置,就可以确保remove之后,如果本来就是va映射的页面,也不会被加入到空闲链表中,规避了之前所说的那种bug的出现。

Exercise 5

1
2
3
4
5
6
7
8
9
// Map 'pages' read-only by the user at linear address UPAGES
// Permissions:
// - the new image at UPAGES -- kernel R, user R
// (ie. perm = PTE_U | PTE_P)
// - pages itself -- kernel RW, user NONE
// Your code goes here:
n = ROUNDUP(npages*sizeof(struct PageInfo), PGSIZE);
for (i = 0; i < n; i += PGSIZE)
page_insert(kern_pgdir, pa2page(PADDR(pages) + i), (void *)(UPAGES + i), PTE_U | PTE_P);

这里是要将pages个映射到UPAGES以上的内容,那么这里要考虑到pages这整个内容实际上是对应着许多PageInfo结构的,在进行映射的同时需要对于PageInfo内部的引用数进行修改,这里采用一个for循环将所有页面依次进行映射。权限位由于在注释中说明,需要内核和用户都可读,所以标注成PTE_U|PTE_P。

1
2
3
4
5
6
7
8
9
10
11
// Use the physical memory that 'bootstack' refers to as the kernel
// stack. The kernel stack grows down from virtual address KSTACKTOP.
// We consider the entire range from [KSTACKTOP-PTSIZE, KSTACKTOP)
// to be the kernel stack, but break this into two pieces:
// * [KSTACKTOP-KSTKSIZE, KSTACKTOP) -- backed by physical memory
// * [KSTACKTOP-PTSIZE, KSTACKTOP-KSTKSIZE) -- not backed; so if
// the kernel overflows its stack, it will fault rather than
// overwrite memory. Known as a "guard page".
// Permissions: kernel RW, user NONE
// Your code goes here:
boot_map_region(kern_pgdir, KSTACKTOP - KSTKSIZE, KSTKSIZE, PADDR(bootstack), PTE_W);

这里进行的是一个连续地址的映射,完成的是内核栈的一个映射。这个地方被划分成了[KSTACKTOP-KSTKSIZE, KSTACKTOP)[KSTACKTOP-PTSIZE, KSTACKTOP-KSTKSIZE)两个部分,通过注释内容可以知道,前一段是需要映射到物理地址的,后一段是不需要的。所以我们要做的只是将前一段进行映射。这里bootstack是已经知道的,通过PADDR将其转换为物理地址,然后映射以KSTACKTOP-KSTKSIZE为起点,KSTKSIZE大小的内容。权限由于在boot_map_region()当中会自动加上PTE_P,所以这里只要标注PTE_W。

1
2
3
4
5
6
7
8
9
// Map all of physical memory at KERNBASE.
// Ie. the VA range [KERNBASE, 2^32) should map to
// the PA range [0, 2^32 - KERNBASE)
// We might not have 2^32 - KERNBASE bytes of physical memory, but
// we just set up the mapping anyway.
// Permissions: kernel RW, user NONE
// Your code goes here:
//cprintf("kernbase: %x 2^32-kernbase: %x", KERNBASE, (~KERNBASE)+1);
boot_map_region(kern_pgdir, KERNBASE, (~KERNBASE) + 1, 0, PTE_W);

同样的是一个进行简单的连续地址映射的操作,那么这个地方也是采用boot_map_region()来进行,但是这里需要得到2^32,而32位大小是表示不出这么大的数的,所以这里采用(~KERNBASE)+1来得到需要进行映射的大小。这里的权限同样由于boot_map_region()会自动加上PTE_P,所以只需要标注PTE_W就可以了。

到这里为止,通过执行

1
make grade

可以得到如下的结果:

1
2
3
4
5
6
running JOS: (2.8s)
Physical page allocator: OK
Page management: OK
Kernel page directory: OK
Page management 2: OK
Score: 70/70

说明已经满足所有check函数的需求,完成了虚拟内存系统的一个初始化。

Questions:

这里的value应当是一个虚拟地址,在程序里面,并不能直接对于物理地址进行操控,所有的指针都应当是虚拟地址。

这里需要注意的是:JOS将从0开始的所有物理内存映射到虚拟地址0xf0000000就是为了让内核能够读写只知道物理地址的内容。那么为了完成从物理地址到虚拟地址的转换,对于只知道物理地址的,就将其物理地址加上0xf0000000,就可以得到对应的虚拟地址了。利用定义好的宏KADDA(pa)可以做到,而宏PADDA(va)就是这个的逆操作。在Exercise4当中这两个宏能够有效地进行虚拟地址物理地址之间的转换,从而使的解引用等操作可以进行执行。

  • 表格如下:
Entry Base Virtual Address Points to (logically)
960 0xf0000000 以上映射到物理地址从0开始的位置
959 0xefff8000 内核栈
958 0xef800000 页表(UVPT)
957 0xef400000 pages数组(UPAGES)
. . .
0 0x00000000 [see next question]
  • We have placed the kernel and user environment in the same address space. Why will user programs not be able to read or write the kernel’s memory? What specific mechanisms protect the kernel memory?

因为在表项中存在权限为,只有PTE_U被设置成1的时候才可以让user访问,kernel memory只需要修改权限为就可以不被user读写。

  • What is the maximum amount of physical memory that this operating system can support? Why?

在kern/pmap.c中的mem_init()函数中可以看到将pages数组映射到了线性地址的UPAGES上方。那么在inc/memlayout.h的图中可以看到,给只读的pages数组分配的空间为4M大小(一个PTSIZE)。

1
2
3
*    UVPT      ---->  +------------------------------+ 0xef400000
* | RO PAGES | R-/R- PTSIZE
* UPAGES ----> +------------------------------+ 0xef000000

一个PageInfo的大小是8Byte,一个页面的大小是4K。所以可以得到4M的pages数组对应的物理内存大小是:

422084210=2230\frac{4*2^{20}}{8}*4*2^{10} = 2*2^{30}

即操作系统能够支持的物理内存大小不会超过2G,理由如上。

  • How much space overhead is there for managing memory, if we actually had the maximum amount of physical memory? How is this overhead broken down?

如果所有虚拟地址都被映射的话,那么页表的开销,一级页表需要1个page,二级页表需要1024个page。总共需要1025个page。所以页表上的开销为10254=4100KB1025*4=4100KB

采用大页可以减少开销,这样只需要一级页表就可以进行索引,需要一个page也就是4KB就可以了。

  • Revisit the page table setup in kern/entry.S and kern/entrypgdir.c. Immediately after we turn on paging, EIP is still a low number (a little over 1MB). At what point do we transition to running at an EIP above KERNBASE? What makes it possible for us to continue executing at a low EIP between when we enable paging and when we begin running at an EIP above KERNBASE? Why is this transition necessary?

在27行处利用jmp *%eax进行了跳转,完成了在高地址执行的转换。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
.globl entry
entry:
movw $0x1234,0x472 # warm boot

# We haven't set up virtual memory yet, so we're running from
# the physical address the boot loader loaded the kernel at: 1MB
# (plus a few bytes). However, the C code is linked to run at
# KERNBASE+1MB. Hence, we set up a trivial page directory that
# translates virtual addresses [KERNBASE, KERNBASE+4MB) to
# physical addresses [0, 4MB). This 4MB region will be
# sufficient until we set up our real page table in mem_init
# in lab 2.

# Load the physical address of entry_pgdir into cr3. entry_pgdir
# is defined in entrypgdir.c.
movl $(RELOC(entry_pgdir)), %eax
movl %eax, %cr3
# Turn on paging.
movl %cr0, %eax
orl $(CR0_PE|CR0_PG|CR0_WP), %eax
movl %eax, %cr0

# Now paging is enabled, but we're still running at a low EIP
# (why is this okay?). Jump up above KERNBASE before entering
# C code.
mov $relocated, %eax
jmp *%eax
relocated:

# Clear the frame pointer register (EBP)
# so that once we get into debugging C code,
# stack backtraces will be terminated properly.
movl $0x0,%ebp # nuke frame pointer

# Set the stack pointer
movl $(bootstacktop),%esp

# now to C code
call i386_init

同时在低EIP和高EIP访问的原因是,我们将虚拟地址的[0, 4MB)和[KERNBASE, KERNBASE+4MB)都映射到了物理地址的[0, 4MB),所以无论从低地址还是高地址都可以进行访问。

1
2
3
4
5
6
7
8
9
__attribute__((__aligned__(PGSIZE)))
pde_t entry_pgdir[NPDENTRIES] = {
// Map VA's [0, 4MB) to PA's [0, 4MB)
[0]
= ((uintptr_t)entry_pgtable - KERNBASE) + PTE_P,
// Map VA's [KERNBASE, KERNBASE+4MB) to PA's [0, 4MB)
[KERNBASE>>PDXSHIFT]
= ((uintptr_t)entry_pgtable - KERNBASE) + PTE_P + PTE_W
};

从前面的内容可以看到,他在完成分页之后还有在低地址执行的语句,如果不同时将高地址和低地址都映射到物理地址的最低4M的话,那么在低地址运行的代码会出错。

Challenge

可以在inc/mmu.h当中找到关于PTE/PDE flag的描述,具体内容如下:

1
2
3
4
5
6
7
8
9
10
// Page table/directory entry flags.
#define PTE_P 0x001 // Present
#define PTE_W 0x002 // Writeable
#define PTE_U 0x004 // User
#define PTE_PWT 0x008 // Write-Through
#define PTE_PCD 0x010 // Cache-Disable
#define PTE_A 0x020 // Accessed
#define PTE_D 0x040 // Dirty
#define PTE_PS 0x080 // Page Size
#define PTE_G 0x100 // Global

第九行所示的就是PTE_PS位,是用来调整Page Size大小的。

采用大页只有一级页表,对应的地址翻译方式如下:

image-20200310182755807

通过Intel IA32手册3.6.1节关于Page Option的描述可以知道,需要开启cr4里面的PSE标志位,来说明提供对于大页的支持,在mem_init()当中添加如下代码进行实现:

1
2
3
4
// Set CR4_PSE
cr4 = rcr4();
cr4 |= CR4_PSE;
lcr4(cr4);

考虑到lab整体要对于这种大小页混合的方式进行适配的话,需要对于页面相关的许多函数进行重写。所以这里只考虑虚拟地址高256M到物理地址低256M的映射采用大页实现,只对于boot_map_region及其相关函数进行修改。

修改pgdir_walk()函数如下,其中normal状态是针对仅存在4K大小页的情况,而ex表示的是大小页混合状态的情况:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
pte_t *
pgdir_walk_normal(pde_t *pgdir, const void *va, int create)
{
// Fill this function in
uint32_t pdx = PDX(va);
uint32_t ptx = PTX(va);
if(pgdir[pdx] == 0)
{
if(create)
{
struct PageInfo* newpte = page_alloc(1);
if(newpte == NULL)
return NULL;
++(newpte->pp_ref);
pgdir[pdx] = page2pa(newpte) | PTE_P | PTE_W | PTE_U;
}
else
return NULL;
}
physaddr_t pte = PTE_ADDR(pgdir[pdx]) | (ptx << 2);
return KADDR(pte);
}

pte_t *
pgdir_walk_ex(pde_t *pgdir, const void *va, int create)
{
uint32_t pdx = PDX(va);
if(pgdir[pdx] == 0)
{
if(create == 1)
{
struct PageInfo* newpte = page_alloc(1);
if(newpte == NULL)
return NULL;
++(newpte->pp_ref);
pgdir[pdx] = page2pa(newpte) | PTE_P | PTE_W | PTE_U;
}
else if(create == 2)
{
pgdir[pdx] = PTE_PS;
}
else
return NULL;
}
else if(create == 2 && (!(pgdir[pdx] & PTE_PS)))
{
struct PageInfo * pp = pa2page(PTE_ADDR(pgdir[pdx]));
page_decref(pp);
tlb_invalidate(pgdir, (void*)va);
pgdir[pdx] = PTE_PS;
}
uint32_t pde = pgdir[pdx];
if(pde & PTE_PS)
{
return pgdir + pdx;
}
else
{
return KADDR(PTE_ADDR(pgdir[pdx]) | (PTX(va) << 2));
}
}

pte_t *
pgdir_walk(pde_t *pgdir, const void *va, int create)
{
uint32_t size_ex = rcr4() & CR4_PSE;
if(size_ex)
return pgdir_walk_normal(pgdir, va, create);
else
return pgdir_walk_ex(pgdir, va, create);
}

这里仅仅对ex函数进行讨论,首先create可能为0、1或者2,不同于normal函数只存在0、1两种情况。0的时候表示不进行额外的分配。1的情况表示是一个小页,2的情况表示是一个大页,只要非0都是表示若不存在则进行分配。

那么如果当前当做一个大页的话,进行分配的情况不需要再去分配一个页面作为二级页表,只需要标记PTE_PS位返回填充对应的物理地址基址就好了。但是存在一个情况在于,原本这个pde指向的是一个二级页表,但是当前是采用大页进行分配的。所以在45行处有针对这种情况的特判。需要做的是将对应的二级页表的页面给清空,然后当做一个新分配的大页进行返回就可以了。

之后考虑的是boot_map_region()函数,同样重写:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
static void
boot_map_region_normal(pde_t *pgdir, uintptr_t va, size_t size, physaddr_t pa, int perm)
{
// Fill this function in
while(size > 0)
{
pte_t* pte = pgdir_walk(pgdir, (void *)va, 1);
if(pte == NULL)
panic("boot_map_region: Fail to alloc new page, run out of memory!\n");
*pte = pa | perm | PTE_P;
size -= PGSIZE;
va += PGSIZE, pa += PGSIZE;
}
}

static void
boot_map_region_ex(pde_t *pgdir, uintptr_t va, size_t size, physaddr_t pa, int perm)
{
while(size > 0)
{
pte_t* pte = pgdir_walk_ex(pgdir, (void *)va, 2);
if(pte == NULL)
panic("boot_map_region: Fail to alloc new page, run out of memory!\n");
*pte = pa | perm | PTE_P | PTE_PS;
size -= PTSIZE;
va += PTSIZE, pa+= PTSIZE;
}
}

static void
boot_map_region(pde_t *pgdir, uintptr_t va, size_t size, physaddr_t pa, int perm)
{
uint32_t size_ex = rcr4() & CR4_PSE;
if(size_ex)
{
if(ROUNDUP(pa, PTSIZE) < ROUNDDOWN(pa+size, PTSIZE))
{
boot_map_region_normal(pgdir, va, ROUNDUP(pa, PTSIZE) - pa, pa, perm);
boot_map_region_ex(pgdir, va+ROUNDUP(pa, PTSIZE)-pa, ROUNDDOWN(pa+size, PTSIZE) - ROUNDUP(pa, PTSIZE), ROUNDUP(pa, PTSIZE), perm);
boot_map_region_normal(pgdir, va+ROUNDDOWN(pa+size, PTSIZE)-pa, pa+size - ROUNDDOWN(pa+size, PTSIZE), ROUNDDOWN(pa+size, PTSIZE), perm);
}
else
boot_map_region_normal(pgdir, va, size, pa, perm);
}
else
boot_map_region_normal(pgdir, va, size, pa, perm);
}

这里boot_map_region也被分成了两种情况,normal表示的是以4K为一个页面进行映射构造,ex表示的是以4M为一个页面大小进行映射构造。

在36行处的判断说明,仅当cr4被标识成拓展的页面大小且进行映射的区间内存在连续的4M空间的时候,对可以采用大页进行分配的部分采用大页,即调用boot_map_region_ex()函数。所以像是内核栈这种大小只有几十K的映射,在cr4设置之后仍然是采用原有的小页方法进行映射的。

这个时候采用原有的check函数会产生错误,原因在于原有的check函数所进行的地址转换方法是二级的。

重写原来给定的va2pa函数如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
static physaddr_t
check_va2pa(pde_t *pgdir, uintptr_t va)
{
pte_t *p;
pgdir = &pgdir[PDX(va)];
if(!(*pgdir & PTE_P))
return ~0;
if((*pgdir & PTE_PS))
{
return ((*pgdir) & 0xffc00000) | (va & 0x3ff000);
}
else
{
p = (pte_t*) KADDR(PTE_ADDR(*pgdir));
if (!(p[PTX(va)] & PTE_P))
return ~0;
return PTE_ADDR(p[PTX(va)]);
}

}

根据PTE_PS标志位来决定采用一级寻址还是二级寻址,这样就可以得到正常的结果:

1
2
3
4
5
6
7
8
check_page_free_list() succeeded!
check_page_alloc() succeeded!
check_page() succeeded!
check_kern_pgdir() succeeded!
check_page_installed_pgdir() succeeded!
Welcome to the JOS kernel monitor!
Type 'help' for a list of commands.
K>

这里在清空cr4之后都是采用normal的方式来进行映射和寻址,所以会保持和原来相同的行为。

Challenge2

showmappings

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
int 
mon_showmappings(int argc, char **argv, struct Trapframe *tf)
{
if(argc != 3)
{
cprintf("showmappings: should input 3 arguments!\n");
return 0;
}
uint32_t lowerbound = strtol(argv[1], '\0', 16);
uint32_t upperbound = strtol(argv[2], '\0', 16);
uint32_t va;
cprintf("Virtual Address\tPhysical Address\turw\n");
for(va = ROUNDDOWN(lowerbound, PGSIZE); va <= ROUNDUP(upperbound, PGSIZE); va += PGSIZE)
{

pte_t * pte = pgdir_walk(kern_pgdir, (void *)va, 0);
if(pte && ((*pte) & PTE_P))
{
physaddr_t pa = PTE_ADDR(*pte);
char perm_U = ((*pte) & PTE_U) ? 'u' : '-';
char perm_P = ((*pte) & PTE_P) ? 'r' : '-';
char perm_W = ((*pte) & PTE_W) ? 'w' : '-';
cprintf(" 0x%08x\t 0x%08x\t%c%c%c\n" , va, pa, perm_U, perm_P, perm_W);
}
else
cprintf(" 0x%08x\t 0x--------\t---\n", va);
}
return 0;
}

代码如上,可以采用如下形式进行[start_va, end_va]区间内虚拟地址到物理地址页面映射的查询:

1
showmappings <start_va> <end_va>

结果如下所示:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
K> showmappings 0xefff0000 0xf0000000
Virtual Address Physical Address urw
0xefff0000 0x-------- ---
0xefff1000 0x-------- ---
0xefff2000 0x-------- ---
0xefff3000 0x-------- ---
0xefff4000 0x-------- ---
0xefff5000 0x-------- ---
0xefff6000 0x-------- ---
0xefff7000 0x-------- ---
0xefff8000 0x00117000 -rw
0xefff9000 0x00118000 -rw
0xefffa000 0x00119000 -rw
0xefffb000 0x0011a000 -rw
0xefffc000 0x0011b000 -rw
0xefffd000 0x0011c000 -rw
0xefffe000 0x0011d000 -rw
0xeffff000 0x0011e000 -rw
0xf0000000 0x00000000 -rw

setperm

提供权限位的设置方法,代码如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
int
mon_setperm(int argc, char **argv, struct Trapframe * tf)
{
if(argc == 3)
{
uint32_t va = strtol(argv[1], '\0', 16);
int perm_U = 0;
int perm_P = 0;
int perm_W = 0;
int i;
for(i = 0; argv[2][i]; ++i)
{
if(argv[2][i] == 'u')
perm_U = 1;
else if(argv[2][i] == 'p')
perm_P = 1;
else if(argv[2][i] == 'w')
perm_W = 1;
}
pte_t * pte = pgdir_walk(kern_pgdir, (void *)va, 0);
if(pte)
{
if(perm_U)
*pte = (*pte) | PTE_U;
if(perm_P)
*pte = (*pte) | PTE_P;
if(perm_W)
*pte = (*pte) | PTE_W;
}
else
{
cprintf("The virtual address 0x%08x is unmapped\n", va);
}
}
else if(argc == 4)
{
uint32_t lowerbound = strtol(argv[1], '\0', 16);
uint32_t upperbound = strtol(argv[2], '\0', 16);
int perm_U = 0;
int perm_P = 0;
int perm_W = 0;
int i, va;
for(i = 0; argv[3][i]; ++i)
{
if(argv[3][i] == 'u')
perm_U = 1;
else if(argv[3][i] == 'p')
perm_P = 1;
else if(argv[3][i] == 'w')
perm_W = 1;
}
for(va = ROUNDDOWN(lowerbound, PGSIZE); va <= ROUNDUP(upperbound, PGSIZE); va += PGSIZE)
{
pte_t * pte = pgdir_walk(kern_pgdir, (void *)va, 0);
if(pte)
{
if(perm_U)
*pte = (*pte) | PTE_U;
if(perm_P)
*pte = (*pte) | PTE_P;
if(perm_W)
*pte = (*pte) | PTE_W;
}
else
{
cprintf("The virtual address 0x%08x is unmapped\n", va);
}
}
}
else
{
cprintf("setperm: should give one address or an address range!\n");
}
return 0;
}

支持输入单个虚拟地址对对应的页表项进行更改,或者对一个虚拟地址区间进行修改:

1
2
setperm <va> <perm>
setperm <start_va> <end_va> <perm>

样例如下:

1
2
3
4
5
6
7
8
9
K> showmappings 0xf0000000 0xf0001000
Virtual Address Physical Address urw
0xf0000000 0x00000000 -rw
0xf0001000 0x00001000 -rw
K> setperm 0xf0000000 u
K> showmappings 0xf0000000 0xf0001000
Virtual Address Physical Address urw
0xf0000000 0x00000000 urw
0xf0001000 0x00001000 -rw

可以发现确实对权限位进行了设置。

clearperm

对权限位进行清空,整体框架和setperm相同,只需要将修改部分的代码改成:

1
2
3
4
5
6
if(perm_U)
*pte = (*pte) & (~PTE_U);
if(perm_P)
*pte = (*pte) & (~PTE_P);
if(perm_W)
*pte = (*pte) & (~PTE_W);

测试结果如下:

1
2
3
4
5
6
7
8
9
K> showmappings 0xf0000000 0xf0001000
Virtual Address Physical Address urw
0xf0000000 0x00000000 -rw
0xf0001000 0x00001000 -rw
K> clearperm 0xf0000000 w
K> showmappings 0xf0000000 0xf0001000
Virtual Address Physical Address urw
0xf0000000 0x00000000 -r-
0xf0001000 0x00001000 -rw

changeperm

对权限位进行修改,整体框架和setperm相同,只需要将修改部分的代码改成:

1
2
3
4
5
6
if(perm_U)
*pte = (*pte) ^ PTE_U;
if(perm_P)
*pte = (*pte) ^ PTE_P;
if(perm_W)
*pte = (*pte) ^ PTE_W;

测试结果如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
K> showmappings 0xf0000000 0xf0001000
Virtual Address Physical Address urw
0xf0000000 0x00000000 -rw
0xf0001000 0x00001000 -rw
K> changeperm 0xf0000000 w
K> showmappings 0xf0000000 0xf0001000
Virtual Address Physical Address urw
0xf0000000 0x00000000 -r-
0xf0001000 0x00001000 -rw
K> changeperm 0xf0000000 w
K> showmappings 0xf0000000 0xf0001000
Virtual Address Physical Address urw
0xf0000000 0x00000000 -rw
0xf0001000 0x00001000 -rw

content

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
int
mon_content(int argc, char **argv, struct Trapframe *tf)
{
if(argc != 4)
{
cprintf("content: type \"help\" to see the example!\n");
return 0;
}
physaddr_t pa;
if(argv[1][1] == 'v')
{
uint32_t base_va = strtol(argv[2], '\0', 16);
pte_t * pte = pgdir_walk(kern_pgdir, (void *)base_va, 0);
pa = PTE_ADDR(*pte) | PGOFF(base_va);
}
else if(argv[1][1] == 'p')
{
pa = strtol(argv[2], '\0', 16);
}
else
{
cprintf("content: -p means physical address, -v means virtual address!\n");
return 0;
}
uint32_t count = strtol(argv[3], '\0', 10);
int check;
for(check = 0; count > 0; --count, ++check)
{
if(check == 0)
cprintf("0x%08x:", pa);
cprintf(" 0x%08x", *(uint32_t*)(KADDR(pa)));
pa += 4;
if(check == 3)
{
check = -1;
cprintf("\n");
}
}
if(check)
cprintf("\n");
return 0;
}

代码如上,用来查看虚拟地址或者物理地址对应的具体内容,可以利用如下的命令形式进行查询:

1
2
content -p <pa> <number>
content -v <va> <number>

其中利用-p或者-v来表明查询的是物理地址还是虚拟地址,number表示要查询的多少,示例结果如下:

1
2
3
4
5
6
K> content -p 0x0 8
0x00000000: 0xf000ff53 0xf000ff53 0xf000e2c3 0xf000ff53
0x00000010: 0xf000ff53 0xf000ff54 0xf000ff53 0xf000ff53
K> content -v 0xf0000000 8
0x00000000: 0xf000ff53 0xf000ff53 0xf000e2c3 0xf000ff53
0x00000010: 0xf000ff53 0xf000ff54 0xf000ff53 0xf000ff53

之前可以知道,KERNBASE以上的虚拟地址映射到的是从零开始的虚拟地址,所以上面得到的结果是完全相同的。利用qemu的指令进行检查:

1
2
3
(qemu) xp /8x 0x0
0000000000000000: 0xf000ff53 0xf000ff53 0xf000e2c3 0xf000ff53
0000000000000010: 0xf000ff53 0xf000ff54 0xf000ff53 0xf000ff53

可以发现得到的结果完全相同,说明指令运行没有问题。

Challenge3&4

没写代码,感觉两个Challenge是递进的关系。challenge3可以考虑只保存包括内核自身的页目录,以及内核栈地址用来往内核栈写入参数保存信息,中断向量表等陷入内核态需要的信息。这样可能只需要几个page就足够了。陷入内核之后通过内核自身的页目录来完成地址映射进行寻址以及执行。这几个和内核相关的页面都应该是内核可读写,用户没有权限。

之后Challenge4由于Challenge3已经将内核相关的地址空间大小缩小了。如果进程想要对于这些地址进行分配的话,那么由于权限不够,会触发异常。处理的手段就是将这部分内容放到暂时还没有使用的地址,并且对相应的地址链接等内容进行修改,然后再次进行分配操作。我感觉这可能是bouncing kernel的意思,找了很久也没有找到bouncing kernel相关的资料或者论文。由于在challenge3当中把需要内核相关的内容缩小到了几个page,所以就可以大大减少需要触发弹跳机制的频率,降低为了更大地址空间所带来的额外时间开销。

Challenge5

我觉得可以考虑采用类似ICS中malloc lab里面的方法,在PageInfo里面加入前后page的链接以及这个连续页面的大小(是PGSIZE的整数倍)。之后利用first-fit或者best-fit的方式进行适配,删除的时候考虑前后的合并。这样应该就可以完成连续地址的分配。对于比较大的连续分配还可以结合Challenge1当中的大页从而节省掉二级页表的空间。

题面中所说的"power-of-two allocation unit sizes from 4KB up to some reasonable maximum of your choice."应该就是伙伴系统了。感觉要完全实现除去修改自己写的函数之外需要修改check_page_free_list()以及kern/pmap.h当中的宏以及辅助函数,不知道会不会引发什么其他地方未知的错误,没有进行代码实现。

首先利用brew进行安装:

1
brew install libomp

完成之后,采用如下的测试代码,存储为hello.c

1
2
3
4
5
6
#include <omp.h>
#include <stdio.h>
int main() {
#pragma omp parallel
printf("Hello from thread %d, nthreads %d\n", omp_get_thread_num(), omp_get_num_threads());
}

尝试利用gcc进行编译:

1
gcc hello.c -fopenmp -o hello && ./hello

诡异的事情出现了,会得到如下的结果,非常奇怪,gcc应该是支持这个选项的:

1
clang: error: unsupported option '-fopenmp'

从这个地方可以发现clang进行了报错,但是明明是使用gcc进行编译的,利用gcc -v查看可以看到如下发现:

1
2
3
4
Apple clang version 11.0.0 (clang-1100.0.33.17)
Target: x86_64-apple-darwin19.3.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin

这里名叫gcc的东西实际上是clang,所以正常使用gcc编译的时候实际上是用的clang???之后采用gcc-9来指定gcc进行编译,而不采用clang:

1
gcc-9 hello.c -fopenmp -o hello && ./hello 

可以正常的执行并得到如下的结果:

1
2
3
4
5
6
7
8
Hello from thread 1, nthreads 8
Hello from thread 4, nthreads 8
Hello from thread 3, nthreads 8
Hello from thread 6, nthreads 8
Hello from thread 5, nthreads 8
Hello from thread 0, nthreads 8
Hello from thread 2, nthreads 8
Hello from thread 7, nthreads 8

由于多线程并不能保证执行顺序,可以看到Hello的打印顺序是不一样的,到这里就可以愉快使用OpenMP了!

简介

文章提出了一种利用重复一个简单基本块从而聚集一系列有着相同拓扑结构的的转换,这种多分支的结构叫做ResNeXt,相对于ResNet的有着更好地性能。

就像VGG和ResNet一样,都是通过堆叠有着相同拓扑结构的模块这种简单策略,来实现更好的效果。而Inception model不同,是通过一种split-transform-merge的策略,首先split来得到一些低维的embedding,然后过一系列不同的filter来进行transform,最后直接拼接merge在一起,通过这种方式来用更小的算力企图获得更大更深的网络能够带来的表现。

这篇论文中提出了一个同样是重复模块的简单模型,从VGG/ResNet和Inception model都借鉴了策略,将一系列有着相同拓扑结构的transformation给聚集起来了。这种聚集的transformation的多少叫做cardinality。实验证明,当提高网络的深度和宽度得到减少的回报的时候,提升cardinality是一个更有效的提升准确率的方法。

网络结构

这种网络有着三种等价形式:

可以发现最上面一层每一条路径都能够看到全部的数据,最后面一层由于最后对于多条之路要汇总求和,所以也是可以直接做卷积,能够看到全部的数据的。事实上只有中间的卷积操作,对于每一条支路而言,只能看到上一层部分的数据。虽然三者相互等价,但是显然在实现上采用c中描述的形式要简便许多。

以上分析针对三层以上网络,那么对于小于三层的网络而言,两种实现是完全等价的。

那么这里采用新的形式从64变为32x4d的方法只是额外增加了网络宽度。

参数量

image-20200229223535943

以上是几种参数规模差不多的设置,其中C=1C=1的情况代表的就是普通的ResNet,实验结果最好的为C=32C=32,即32×4d32\times4d的模型。对于每一层的参数计算如下:

C(256d+3dd+d256)C \cdot(256*d+3*d*d+d*256)

代码实现

其实基本和ResNet的实现相同,由于pyTorch的卷积层自身有group参数,采用之前提到的三种等价形式的最后一种,只需要在Bottleneck的模块中将中间的卷积层的group设置成32,重新设置Basicblock和Bottleneck的expansion为原来的二分之一,调整channel的大小为原来的两倍,就可以得到ResNeXt了,下面是ResNeXt(32x4D)的一个实现:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
class BasicBlock(nn.Module):
expansion = 0.5

def __init__(self, input_channel, channel, stride):
super(BasicBlock, self).__init__()

output_channel = int(channel * self.expansion)
self.downsample = lambda x: x
if(input_channel != output_channel):
self.downsample = nn.Sequential(
nn.Conv2d(in_channels = input_channel, out_channels = output_channel, kernel_size = 1, stride = stride, bias = False),
nn.BatchNorm2d(output_channel)
)

self.relu = nn.ReLU(inplace = True)

self.convlayers = nn.Sequential(
nn.Conv2d(in_channels = input_channel, out_channels = channel, kernel_size = 3, stride = stride, padding = 1, bias = False),
nn.BatchNorm2d(channel),
nn.ReLU(inplace = True),
nn.Conv2d(in_channels = channel, out_channels = output_channel, kernel_size = 3, stride = 1, padding = 1, bias = False),
nn.BatchNorm2d(output_channel)
)
def forward(self, x):
out = self.downsample(x) + self.convlayers(x)
out = self.relu(out)
return out

class Bottleneck(nn.Module):
expansion = 2

def __init__(self, input_channel, channel, stride, expansion = 2, group_num = 32):
super(Bottleneck, self).__init__()
self.expansion = expansion
output_channel = channel * expansion

self.downsample = lambda x: x
if(input_channel != output_channel):
self.downsample = nn.Sequential(
nn.Conv2d(in_channels = input_channel, out_channels = output_channel, kernel_size = 1, stride = stride, bias = False),
nn.BatchNorm2d(output_channel)
)

self.relu = nn.ReLU(inplace = True)

self.convlayers = nn.Sequential(
nn.Conv2d(in_channels = input_channel, out_channels = channel, kernel_size = 1, stride = 1, bias = False),
nn.BatchNorm2d(channel),
nn.ReLU(inplace = True),
nn.Conv2d(in_channels = channel, out_channels = channel, kernel_size = 3, stride = stride, padding = 1, groups = group_num, bias = False),
nn.BatchNorm2d(channel),
nn.ReLU(inplace = True),
nn.Conv2d(in_channels = channel, out_channels = output_channel, kernel_size = 1, stride = 1, bias = False),
nn.BatchNorm2d(output_channel)
)
def forward(self, x):
out = self.downsample(x) + self.convlayers(x)
out = self.relu(out)
return out

class ResNet(nn.Module):
def __init__(self, block, block_nums, input_channel, class_num = 1000):
super(ResNet, self).__init__()

self.stacklayers = nn.Sequential(
nn.Conv2d(in_channels = input_channel, out_channels = 64, kernel_size = 7, stride = 2, padding = 3, bias = False),
nn.BatchNorm2d(64),
nn.ReLU(inplace = True),
nn.MaxPool2d(kernel_size = 3, stride = 2, padding = 1),
self.make_layers(block = block, input_channel = 64, channel = 128, stride = 1, block_num = block_nums[0]),
self.make_layers(block = block, input_channel = int(128 * block.expansion), channel = 256, stride = 2, block_num = block_nums[1]),
self.make_layers(block = block, input_channel = int(256 * block.expansion), channel = 512, stride = 2, block_num = block_nums[2]),
self.make_layers(block = block, input_channel = int(512 * block.expansion), channel = 1024, stride = 2, block_num = block_nums[3]),
nn.AdaptiveAvgPool2d(1),
nn.Flatten(),
nn.Linear(int(1024*block.expansion), class_num)
)

def make_layers(self, block, input_channel, channel, stride, block_num):
layers = []
layers.append(block(input_channel, channel, stride))
input_channel = int(channel * block.expansion)
for _ in range(1, block_num):
layers.append(block(input_channel, channel, 1))
return nn.Sequential(*layers)

def forward(self, x):
out = self.stacklayers(x)
return out

def ResNeXt_18(input_channel, class_num):
return ResNet(BasicBlock, [2,2,2,2], input_channel, class_num)

def ResNeXt_34(input_channel, class_num):
return ResNet(BasicBlock, [3,4,6,3], input_channel, class_num)

def ResNeXt_50(input_channel, class_num):
return ResNet(Bottleneck, [3,4,6,3], input_channel, class_num)

def ResNeXt_101(input_channel, class_num):
return ResNet(Bottleneck, [3,4,23,3], input_channel, class_num)

def ResNeXt_152(input_channel, class_num):
return ResNet(Bottleneck, [3,8,36,3], input_channel, class_num)

Environment Setting

在OS X下进行的环境搭建,配置如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
             ###                  User: pims
#### Hostname: PimsdeMacBook-Pro
### Distro: OS X 10.15.3
####### ####### Kernel: Darwin
###################### Uptime: 7:32
##################### Shell: /bin/zsh
#################### Terminal: xterm-256color iTerm.app
#################### CPU: Intel Core i5-8257U CPU @ 1.40GHz
##################### Memory: 16 GB
###################### Disk: 26%
#################### Battery: 100%
################
#### #####

由于官网提供的补丁版qemu在本地报错make不成功,所以用的是正常版的qemu,但是对于在完成exercise的过程中没有遇到很大的问题,对于lab文件的make结果如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
➜  lab git:(lab1) ✗ make
+ as kern/entry.S
+ cc kern/entrypgdir.c
+ cc kern/init.c
+ cc kern/console.c
+ cc kern/monitor.c
+ cc kern/printf.c
+ cc kern/kdebug.c
+ cc lib/printfmt.c
+ cc lib/readline.c
+ cc lib/string.c
+ ld obj/kern/kernel
i386-jos-elf-ld: warning: section `.bss' type changed to PROGBITS
+ as boot/boot.S
+ cc -Os boot/main.c
+ ld boot/boot
boot block is 382 bytes (max 510)
+ mk obj/kern/kernel.img

Exercise1

内容为阅读汇编的文档,进行了阅读,了解了内嵌汇编的语法格式。

Exercise2

逐步执行查看了运行过程,并且对于GDB指令进行了进一步的熟悉。

Exercise3

  • At what point does the processor start executing 32-bit code? What exactly causes the switch from 16- to 32-bit mode?
1
2
3
4
5
6
7
8
lgdt    gdtdesc
movl %cr0, %eax
orl $CR0_PE_ON, %eax
movl %eax, %cr0

# Jump to next instruction, but in 32-bit code segment.
# Switches processor into 32-bit mode.
ljmp $PROT_MODE_CSEG, $protcseg

在修改完了cr0的值之后,通过ljmp指令切换到32-bit模式

  • What is the last instruction of the boot loader executed, and what is the first instruction of the kernel it just loaded?

在main.c文件中可以看到,bootmain结尾的最后是:

1
((void (*)(void)) (ELFHDR->e_entry))();

转到ELF头里面的入口,这个函数正常情况下不会返回,所以后面bad里面的死循环在正常情况下是永远不可能执行的代码。

在obj/boot/boot.asm里面可以看到对应的内容为:

1
7d63:	ff 15 18 00 01 00    	call   *0x10018
  • Where is the first instruction of the kernel?

在gdb窗口中b *0x7d63call语句前面打一个断点,之后执行si,可以看到kernel里面的第一条语句是

1
0x10000c:	movw   $0x1234,0x472
  • How does the boot loader decide how many sectors it must read in order to fetch the entire kernel from disk? Where does it find this information?

在ELF头里面,保存了相关的信息,其中Elf数据类型的定义inc/elf.h头文件当中:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
struct Elf {
uint32_t e_magic; // must equal ELF_MAGIC
uint8_t e_elf[12];
uint16_t e_type;
uint16_t e_machine;
uint32_t e_version;
uint32_t e_entry;
uint32_t e_phoff;
uint32_t e_shoff;
uint32_t e_flags;
uint16_t e_ehsize;
uint16_t e_phentsize;
uint16_t e_phnum;
uint16_t e_shentsize;
uint16_t e_shnum;
uint16_t e_shstrndx;
};

其中e_phoff表示Program header table在文件中的偏移量,e_phnum表示Program header table里面一共有多少个条目,在bootmain的主函数中从ELF头读入得到:

1
2
ph = (struct Proghdr *) ((uint8_t *) ELFHDR + ELFHDR->e_phoff);
eph = ph + ELFHDR->e_phnum;

Exercise4

pointers.c的文件具体内容如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
#include <stdio.h>
#include <stdlib.h>

void
f(void)
{
int a[4];
int *b = malloc(16);
int *c;
int i;

printf("1: a = %p, b = %p, c = %p\n", a, b, c);

c = a;
for (i = 0; i < 4; i++)
a[i] = 100 + i;
c[0] = 200;
printf("2: a[0] = %d, a[1] = %d, a[2] = %d, a[3] = %d\n",
a[0], a[1], a[2], a[3]);

c[1] = 300;
*(c + 2) = 301;
3[c] = 302;
printf("3: a[0] = %d, a[1] = %d, a[2] = %d, a[3] = %d\n",
a[0], a[1], a[2], a[3]);

c = c + 1;
*c = 400;
printf("4: a[0] = %d, a[1] = %d, a[2] = %d, a[3] = %d\n",
a[0], a[1], a[2], a[3]);

c = (int *) ((char *) c + 1);
*c = 500;
printf("5: a[0] = %d, a[1] = %d, a[2] = %d, a[3] = %d\n",
a[0], a[1], a[2], a[3]);

b = (int *) a + 1;
c = (int *) ((char *) a + 1);
printf("6: a = %p, b = %p, c = %p\n", a, b, c);
}

int
main(int ac, char **av)
{
f();
return 0;
}

得到的输出结果为:

1
2
3
4
5
6
1: a = 0x7ffeef413940, b = 0x7ffe70c00060, c = 0x7ffeef4139a0
2: a[0] = 200, a[1] = 101, a[2] = 102, a[3] = 103
3: a[0] = 200, a[1] = 300, a[2] = 301, a[3] = 302
4: a[0] = 200, a[1] = 400, a[2] = 301, a[3] = 302
5: a[0] = 200, a[1] = 128144, a[2] = 256, a[3] = 302
6: a = 0x7ffeef413940, b = 0x7ffeef413944, c = 0x7ffeef413941

可以看到b指向的是在堆上面开辟的空间,而a、c都是在栈上面开辟的空间,所以地址存在一定差异。之后都是一些比较简单的地址索引以及指针加法。

Exercise5

这里将boot/Makefrag文件中的0x7C00修改成了0x7D00:

1
2
3
4
5
6
$(OBJDIR)/boot/boot: $(BOOT_OBJS)
@echo + ld boot/boot
$(V)$(LD) $(LDFLAGS) -N -e start -Ttext 0x7D00 -o $@.out $^
$(V)$(OBJDUMP) -S $@.out >$@.asm
$(V)$(OBJCOPY) -S -O binary -j .text $@.out $@
$(V)perl boot/sign.pl $(OBJDIR)/boot/boot

重新make之后,查看obj/boot/boot.asm可以发现链接地址已经发生了改变:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
.globl start
start:
.code16 # Assemble for 16-bit mode
cli # Disable interrupts
7d00: fa cli
cld # String operations increment
7d01: fc cld

# Set up the important data segment registers (DS, ES, SS).
xorw %ax,%ax # Segment number zero
7d02: 31 c0 xor %eax,%eax
movw %ax,%ds # -> Data Segment
7d04: 8e d8 mov %eax,%ds
movw %ax,%es # -> Extra Segment
7d06: 8e c0 mov %eax,%es
movw %ax,%ss # -> Stack Segment
7d08: 8e d0 mov %eax,%ss

但是执行GDB可以看到,事实上BIOS依然将boot loader加载到了0x7c00的位置,也就是说程序在执行到这里的时候,前面依然有一部分是可以正常执行的:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
Breakpoint 1, 0x00007c00 in ?? ()
(gdb) x/40i
0x7c01: cld
0x7c02: xor %eax,%eax
0x7c04: mov %eax,%ds
0x7c06: mov %eax,%es
0x7c08: mov %eax,%ss
0x7c0a: in $0x64,%al
0x7c0c: test $0x2,%al
0x7c0e: jne 0x7c0a
0x7c10: mov $0xd1,%al
0x7c12: out %al,$0x64
0x7c14: in $0x64,%al
0x7c16: test $0x2,%al
0x7c18: jne 0x7c14
0x7c1a: mov $0xdf,%al
0x7c1c: out %al,$0x60
0x7c1e: lgdtl (%esi)
0x7c21: fs jge 0x7c33
0x7c24: and %al,%al
0x7c26: or $0x1,%ax
0x7c2a: mov %eax,%cr0
0x7c2d: ljmp $0xb866,$0x87d32
0x7c34: adc %al,(%eax)
0x7c36: mov %eax,%ds
0x7c38: mov %eax,%es
0x7c3a: mov %eax,%fs
0x7c3c: mov %eax,%gs
0x7c3e: mov %eax,%ss
0x7c40: mov $0x7d00,%esp
0x7c45: call 0x7d0b
0x7c4a: jmp 0x7c4a
0x7c4c: add %al,(%eax)
0x7c4e: add %al,(%eax)
0x7c50: add %al,(%eax)
0x7c52: add %al,(%eax)
0x7c54: (bad)
0x7c55: incl (%eax)
0x7c57: add %al,(%eax)
0x7c59: lcall $0x0,$0xffff00cf
0x7c60: add %dl,0x1700cf(%edx)
0x7c66: dec %esp

但是与之前相比而言,对于ljmp指令发生了改变,这里附上之前的内容:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
Breakpoint 1, 0x00007c00 in ?? ()
(gdb) x/40i
0x7c01: cld
0x7c02: xor %eax,%eax
0x7c04: mov %eax,%ds
0x7c06: mov %eax,%es
0x7c08: mov %eax,%ss
0x7c0a: in $0x64,%al
0x7c0c: test $0x2,%al
0x7c0e: jne 0x7c0a
0x7c10: mov $0xd1,%al
0x7c12: out %al,$0x64
0x7c14: in $0x64,%al
0x7c16: test $0x2,%al
0x7c18: jne 0x7c14
0x7c1a: mov $0xdf,%al
0x7c1c: out %al,$0x60
0x7c1e: lgdtl (%esi)
0x7c21: fs jl 0x7c33
0x7c24: and %al,%al
0x7c26: or $0x1,%ax
0x7c2a: mov %eax,%cr0
0x7c2d: ljmp $0xb866,$0x87c32
0x7c34: adc %al,(%eax)
0x7c36: mov %eax,%ds
0x7c38: mov %eax,%es
0x7c3a: mov %eax,%fs
0x7c3c: mov %eax,%gs
0x7c3e: mov %eax,%ss
0x7c40: mov $0x7c00,%esp
0x7c45: call 0x7d0b
0x7c4a: jmp 0x7c4a
0x7c4c: add %al,(%eax)
0x7c4e: add %al,(%eax)
0x7c50: add %al,(%eax)
0x7c52: add %al,(%eax)
0x7c54: (bad)
0x7c55: incl (%eax)
0x7c57: add %al,(%eax)
0x7c59: lcall $0x0,$0xffff00cf
0x7c60: add %dl,0x1700cf(%edx)
0x7c66: dec %esp

可以看到由$0x87c32变成了$0x87d32,在执行完了ljmp指令之后,程序就会出错了。

Exercise6

在0x7c00处添加断点,查看0x100000地址存放的内容,可以发现是全0,这就是BIOS在进入boot loader的时候,对应的内容。

1
2
0x100000:	0x00000000	0x00000000	0x00000000	0x00000000
0x100010: 0x00000000 0x00000000 0x00000000 0x00000000

之后再0x7d63处添加断点,此时是boot loader要进入内核的时点,0x100000存放的内容如下:

1
2
0x100000:	0x1badb002	0x00000000	0xe4524ffe	0x7205c766
0x100010: 0x34000004 0x7000b812 0x220f0011 0xc0200fd8

可以发现已经发生了改变,并不是一开始的全0,说明boot loader进行了一个载入内核的工作。

Exercise7

从entry.S中,可以看到mov %eax,%cr0位于入口的开头处,利用gdb在入口处设置断点,逐条执行可以发现,这条指令位于0x100025,在此处设置断点:

1
2
3
4
5
6
(gdb) x/8x 0x00100000
0x100000: 0x1badb002 0x00000000 0xe4524ffe 0x7205c766
0x100010: 0x34000004 0x7000b812 0x220f0011 0xc0200fd8
(gdb) x/8x 0xf0100000
0xf0100000 <_start-268435468>: 0x00000000 0x00000000 0x00000000 0x00000000
0xf0100010 <entry+4>: 0x00000000 0x00000000 0x00000000 0x00000000

执行完这条指令之后:

1
2
3
4
5
6
(gdb) x/8x 0x00100000
0x100000: 0x1badb002 0x00000000 0xe4524ffe 0x7205c766
0x100010: 0x34000004 0x7000b812 0x220f0011 0xc0200fd8
(gdb) x/8x 0xf0100000
0xf0100000 <_start-268435468>: 0x1badb002 0x00000000 0xe4524ffe 0x7205c766
0xf0100010 <entry+4>: 0x34000004 0x7000b812 0x220f0011 0xc0200fd8

可以发现,在执行这条指令之前,0xf0100000处是全0的,在执行之后,有了和0x00100000处一样的值。设置了%cr0后启用分页,让0xf0100000和0x00100000映射到了同样的物理地址,所以查看会有相同的值。

在entry.S中可以看到,之后他尝试执行的指令是:

1
2
mov	$relocated, %eax
jmp *%eax

他要在执行C代码之前,跳转到KERNBASE上方,不再在低地址了。从gdb可以发现,这里移入%eax的值为0xf010002f,如果没有启用分页,那么跳转将会失败。

可以发现当注释掉那一行之后,会导致内核崩溃,卡在Booting from Hard Disk..,同时利用GDB查看也可以看到,跳转进入的0xf010002c位置为全0。

Exercise8

缺失的内容定义在printfmt.c中第206行,直接仿照上面的%u进行修改,将base改成8就可以了。

1
2
3
4
5
6
// (unsigned) octal
case 'o':
// Replace this with your code.
num = getuint(&ap, lflag);
base = 8;
goto number;

可以发现,修改完之后,命令行中的"6828 decimal is 15254 octal!"已经可以正确显示了。

  1. Explain the interface between printf.c and console.c. Specifically, what function does console.c export? How is this function used by printf.c?

console.c里面cputchar()函数在printf.c里面被使用了。它的作用是往屏幕上打印一个字符,被用在printf.c里面的putch()函数中,之后作为参数传入vprintfmt()的调用过程。

  1. Explain the following from console.c:
1
2
3
4
5
6
7
if (crt_pos >= CRT_SIZE) {
int i;
memmove(crt_buf, crt_buf + CRT_COLS, (CRT_SIZE - CRT_COLS) * sizeof(uint16_t));
for (i = CRT_SIZE - CRT_COLS; i < CRT_SIZE; i++)
crt_buf[i] = 0x0700 | ' ';
crt_pos -= CRT_COLS;
}

CRT_SIZE指的应该是crt_buff里面显示缓冲区的大小,所以这里的情况实际是超过缓冲区最大上限的时候的处理方法。memmove()函数的定义如下所示,其作用是将str2处复制n个字符到str1处,在有重叠区域的情况下比memcpy()更加安全。

1
void *memmove(void *str1, const void *str2, size_t n)

所以这里所进行的内容是将缓冲区的内容整体前移了CRT_COLS字符,腾出了一部分的缓冲区空间。

  1. For the following questions you might wish to consult the notes for Lecture 2. These notes cover GCC’s calling convention on the x86.
    Trace the execution of the following code step-by-step:

    1
    2
    int x = 1, y = 3, z = 4;
    cprintf("x %d, y %x, z %d\n", x, y, z);
    • In the call to cprintf(), to what does fmt point? To what does ap point?
    • List (in order of execution) each call to cons_putc, va_arg, and vcprintf. For cons_putc, list its argument as well. For va_arg, list what ap points to before and after the call. For vcprintf list the values of its two arguments.

在对于cprintf()的调用当中,fmt指的是格式化的字符串,ap指向的是参数列表。

其中cons_putc()的内容如下:

1
2
3
4
5
6
7
8
// output a character to the console
static void
cons_putc(int c)
{
serial_putc(c);
lpt_putc(c);
cga_putc(c);
}

他的作用是向console输出一个字符,主要的操作位于cga_putc()内容当中,对于输入来确定字符,然后根据情况进行输出。

vcprintf()内容如下:

1
2
3
4
5
6
7
8
int
vcprintf(const char *fmt, va_list ap)
{
int cnt = 0;

vprintfmt((void*)putch, &cnt, fmt, ap);
return cnt;
}

传入格式化的字符串以及对应的参数列表,然后通过调用vprintfmt()进行输出。

va_arg()的调用在vprintfmt()当中出现,例如:

1
2
3
case '*':
precision = va_arg(ap, int);
goto process_precision;

他实现的内容实际上是从ap里面读取一个参数,然后将ap进行一个修改,即通过后面提供的数据类型来进行指针的移动。

这三个函数的关系是vcprintf()中调用了vprintfmt(),在vprintfmt()内部利用va_arg()对格式化字符串中的参数进行解析,之后得到确切的字符串利用cons_putc()函数一个一个字符的向console进行输出。

  1. Run the following code.
1
2
unsigned int i = 0x00646c72;
cprintf("H%x Wo%s", 57616, &i);

What is the output? Explain how this output is arrived at in the step-by-step manner of the previous exercise.

The output depends on that fact that the x86 is little-endian. If the x86 were instead big-endian what would you set i to in order to yield the same output? Would you need to change 57616 to a different value?

得到的输出为:“He110 World”。

可以知道57616转换成16进制的结果为e110,所以前半部分得到的是He110。

后半部分对应ASCII码表可以知道:

由于小端法存储,0x00646c72的存储实际上是

1
72 6c 64 00

对应的字符串就是"rld\0",得到的后半部分为World。

如果改为大端法,对前半部分不会有影响,后半部分需要改成i = 0x726c6400

  1. In the following code, what is going to be printed after ‘y=’? (note: the answer is not a specific value.) Why does this happen?
1
cprintf("x=%d y=%d", 3);

执行得到的结果为"x=3 y=-267288596",因为此处y所对应的%d没有给出,那么他会尝试在栈上读取内容。通过gdb调试可以知道最后传入时候ap = f0117fd4

1
2
3
(gdb) x/8x 0xf0117fd4
0xf0117fd4: 0x00000003 0xf0117fec 0x00000000 0x00000000
0xf0117fe4: 0x00000000 0x00000000 0x00646c72 0x00000000

查看地址所对应的内容,可以发现所打印出来的y其实就是后面的0xf0117fec转换成int的值。

  1. Let’s say that GCC changed its calling convention so that it pushed arguments on the stack in declaration order, so that the last argument is pushed last. How would you have to change cprintf or its interface so that it would still be possible to pass it a variable number of arguments?

需要能够从栈顶知道一共有多少参数才能规范后面的行为,通过调整参数顺序,把fmt字符串当做最后一个参数输入,或者添加一个新参数为参数的总个数n放在末尾都可以。

Exercise9

在kern/entry.S的末尾可以看到如下代码,在.data段里面为栈预留了KSTKSIZE大小的空间。

1
2
3
4
5
6
7
8
9
10
.data
###################################################################
# boot stack
###################################################################
.p2align PGSHIFT # force page alignment
.globl bootstack
bootstack:
.space KSTKSIZE
.globl bootstacktop
bootstacktop:

在obj/kern/kernel.asm中的第56-58行,通过设置%esp来初始化栈的位置。

1
2
3
	# Set the stack pointer
movl $(bootstacktop),%esp
f0100034: bc 00 80 11 f0 mov $0xf0118000,%esp

栈底的位置就是0xf0118000,从高地址往低地址生长。

同时在kern/entry.S的69行处可以看到:

1
2
3
4
5
6
relocated:

# Clear the frame pointer register (EBP)
# so that once we get into debugging C code,
# stack backtraces will be terminated properly.
movl $0x0,%ebp # nuke frame pointer

将栈的终点设置为0,这能够使得后面进行的backtrace可以正常终止,不会陷入死循环或者出错访问到栈之外的空间。

Exercise10

流程就是不断进行函数的调用,输入的参数每一次都-1,从一开始的5到最后的1然后到达递归终点。

obj/kern/kernel.asm中的对应内容如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
void
test_backtrace(int x)
{
f0100040: 55 push %ebp
f0100041: 89 e5 mov %esp,%ebp
f0100043: 53 push %ebx
f0100044: 83 ec 14 sub $0x14,%esp
f0100047: 8b 5d 08 mov 0x8(%ebp),%ebx
cprintf("entering test_backtrace %d\n", x);
f010004a: 89 5c 24 04 mov %ebx,0x4(%esp)
f010004e: c7 04 24 80 18 10 f0 movl $0xf0101880,(%esp)
f0100055: e8 34 09 00 00 call f010098e <cprintf>
if (x > 0)
f010005a: 85 db test %ebx,%ebx
f010005c: 7e 0d jle f010006b <test_backtrace+0x2b>
test_backtrace(x-1);
f010005e: 8d 43 ff lea -0x1(%ebx),%eax
f0100061: 89 04 24 mov %eax,(%esp)
f0100064: e8 d7 ff ff ff call f0100040 <test_backtrace>
f0100069: eb 1c jmp f0100087 <test_backtrace+0x47>
else
mon_backtrace(0, 0, 0);
f010006b: c7 44 24 08 00 00 00 movl $0x0,0x8(%esp)
f0100072: 00
f0100073: c7 44 24 04 00 00 00 movl $0x0,0x4(%esp)
f010007a: 00
f010007b: c7 04 24 00 00 00 00 movl $0x0,(%esp)
f0100082: e8 cb 06 00 00 call f0100752 <mon_backtrace>
cprintf("leaving test_backtrace %d\n", x);
f0100087: 89 5c 24 04 mov %ebx,0x4(%esp)
f010008b: c7 04 24 9c 18 10 f0 movl $0xf010189c,(%esp)
f0100092: e8 f7 08 00 00 call f010098e <cprintf>
}
f0100097: 83 c4 14 add $0x14,%esp
f010009a: 5b pop %ebx
f010009b: 5d pop %ebp
f010009c: c3 ret

通过:

1
f0100044:	83 ec 14             	sub    $0x14,%esp

可以看到,每次栈向下生长0x14,并且每一次函数调用都会传入参数,同时保存%ebp%ebx的值,将其压入栈中。每次栈向下生长0x20。折算成32-bit字的话应当是8个。

Exercise11

补全的mon_backtrace()如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
int
mon_backtrace(int argc, char **argv, struct Trapframe *tf)
{
cprintf("Stack backtrace:\n");
uint32_t* ebp = (uint32_t*)read_ebp();
while(ebp){
cprintf("ebp %08x ",ebp);
cprintf("eip %08x ",*(ebp+1));
cprintf("args");
int i;
for(i=2;i<=6;++i)
cprintf(" %08x",*(ebp+i));
cprintf("\n");
ebp = (uint32_t*)(*ebp);
}
return 0;
}

执行结果如下:

image-20200223161749177

可以看到最顶上的是mon_backtrace()函数,下面是五次的test_backtrace()调用,符合题目要求。

Exercise12

debuginfo_eip()中利用stab_binsearch()函数来查找行号,通过观察inc/stab.h中的宏定义可以发现对应的类型应当是N_SLINE,对于搜索得到的结果,将行号从stabs数组中提取填写到info里面。如果lline>rline说明出现了错误,直接返回-1。

1
2
3
4
5
stab_binsearch(stabs, &lline, &rline, N_SLINE, addr);
if(lline <= rline)
info->eip_line = stabs[lline].n_desc;
else
return -1;

之后修改mon_backtrace()函数内部如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
int
mon_backtrace(int argc, char **argv, struct Trapframe *tf)
{
cprintf("Stack backtrace:\n");
uint32_t* ebp = (uint32_t*)read_ebp();
struct Eipdebuginfo info;
while(ebp){
cprintf("ebp %08x ",ebp);
cprintf("eip %08x ",ebp[1]);
cprintf("args");
int i;
for(i=2;i<=6;++i)
cprintf(" %08x",ebp[i]);
cprintf("\n");

debuginfo_eip(ebp[1], &info);
cprintf("\t%s:%d: %.*s+%d\n", info.eip_file, info.eip_line, info.eip_fn_namelen, info.eip_fn_name, ebp[1]-info.eip_fn_addr);

ebp = (uint32_t*)(*ebp);
}
return 0;
}

在kern/monitor.c文件中加入新的命令行指令,这样当输入backtrace的时候就会调用mon_backtrace()函数。

1
2
3
4
5
static struct Command commands[] = {
{ "help", "Display this list of commands", mon_help },
{ "kerninfo", "Display information about the kernel", mon_kerninfo },
{ "backtrace", "Findout the the value of \%ebp, \%eip and the args of called functions", mon_backtrace},
};

执行结果如下:

image-20200223195603628

可见该指令可以成功被调用。

在本地使用make grade评测可以得到满分50分:

image-20200223195701827

Challenge

通过WIKI百科中对于ANSI escape code的描述可以知道,利用\e[可以开启一个控制序列,那么只需要调整在输出字符串的开头控制颜色,在结尾恢复就可以,例如以下代码就会将输出文本调整成红色:

1
\e[31m<output string>\e[0m

mon_backtrace()中打印行号等部分修改如下:

1
cprintf("\t\e[92m%s\e[0m:\e[31m%d\e[0m: \e[36m%.*s+%d\e[0m\n", info.eip_file, info.eip_line, info.eip_fn_namelen, info.eip_fn_name, ebp[1]-info.eip_fn_addr);

可以看到产生图中所示的彩色输出:

image-20200223195925655

某量化的机器学习岗笔试题,回忆如下,虽然我不知道这笔试和金融/机器学习有什么关系。考试时长四个小时,一共七道题,最后两道编程题选一题做就可以,所以平均下来是每题四十分钟。

一、最大回撤

题目背景:考虑一个股票的价格在NN天内的价格为P1,P2,,PNP_1,P_2,\ldots,P_N,一个投资者比较厌恶风险,他所能接受的最大回撤不能超过DD,试着计算他在这个最大回撤限制下的最大收益,算法复杂度需要是O(N)O(N)的级别。最大回撤定义为max(PiPj),i<j\max(P_i-P_j),i<j,其中i,ji,j都在买卖区间内。

我的解答:双指针+单调队列维护区间最大值

双指针靠左的指向买入日,靠右的指向卖出日,利用单调队列维护买卖区间中的最大值。每次当卖出日右移的时候检查是否满足最大回撤的约束,如果是就尝试更新最大收益,否则的话就左移买入指针,并尝试更新最大收益。卖出指针移动到末端的时候,仍然要依次左移买入指针直到末端,并尝试更新最大收益。由于两个指针都要遍历数组,容易知道复杂度是O(n)O(n)的。

二、概率题——打乒乓球

题目背景:ABC三个人喜欢打乒乓球,但是一张球桌只能有两个人,所以有如下规则:每次两人对打,输家下场,换场下的人上场和赢家对打。那么三个人都非常好强,想要赢下另外两个人至少一局才算打爽了,(如A要打得爽,那么A需要赢过B也要赢过C),且只有所有人都打爽了球局才会结束,如果任意两个人的对局,都是五五开的,那么考虑以下两种情况:

  1. 如果第一局AB对阵,B胜利,第二轮BC对阵,C胜利,第三轮AC对阵,A胜利,第四轮AB对阵,A胜利。从这之后到所有人都打爽了,需要打的局数的期望是多少?
  2. 如果此时三个人刚刚来打球,那么要所有人都打爽了,需要打的局数的期望是多少?

我的解答:

  1. 考虑如下的状态转换:

    一共有一下六个状态abcdefabcdef,他们对应状态要打的局数期望有以下关系:

    {a=1+(c+d)/2b=1+(a+e)/2c=1+(a+b)/2d=1+e/2e=1+(d+f)/2f=1+(d+e)/2\begin{cases} a = 1 + (c+d)/2 \\ b = 1 + (a+e)/2 \\ c = 1 + (a+b)/2 \\ d = 1 + e/2 \\ e = 1+(d+f)/2 \\ f = 1 + (d+e)/2 \end{cases}

    可以解得:

    {a=36/5b=38/5c=42/5d=4e=6f=6\begin{cases} a = 36/5 \\ b = 38/5 \\ c = 42/5 \\ d = 4 \\ e = 6 \\ f = 6 \end{cases}

    所以期望应当是36/536/5轮次

  2. 我确实不会(菜狗哭泣),用它给的做编程题的窗口写了个代码跑模拟,得到的结果是62/562/5的期望。

三、报数

题目背景:一共有2019个人依次编号,首尾相连占成一个圈,教练让从1号开始报数,依次报121212,每次报到2的人出局,由于是一个圈,所以可以一直循环下去。

  1. 问最后留下来的那个人的编号是多少?
  2. 小明拿到了1001号,但是他爸是教练,可以选择在任意时候给他爸一个眼神交流,让他爸讲报数顺序逆转,问小明有没有可能留到最后?

我的解答:

  1. 为了叙述方便首先定义轮次,从第1个人跑到最后算一个轮次,那么可以知道从第一轮留下来的都是奇数,相邻的人差的都是2,第二轮留下来的差的都是4,依次类推。所以容易发现每一轮留下来的人,在二进制表示上,有一位是相同的
    2019的二进制表示为:

111 1110 0011 111\ 1110\ 0011

将其左移一位,末尾补1,找到在1~2019范围内的值就是所求的值,对应的二进制表示为

111 1100 0111 111\ 1100\ 0111

即留下来的是1991号。

  1. 有了前面的推断方法,我们可以判断在一共nn个人的队列中,处于第kk个的时候是不是最后留下来的,考虑1001的二进制表示为:

011 1110 1001 011\ 1110\ 1001

在倒数第二位的二进制表示出现不同,那么可以知道如果他爸不改顺序的话,他计划是在第二轮就被淘汰的。
这里进行讨论,首先看在第一轮的情况下,如果他爸在报数还没报到他的情况下就反转了,能不能让他留到最后。假设在他前面已经有nn个人出局了,那么就还剩下了2019n2019-n个人,由于在他之前,所以可以知道n<=500n<=500,必定有2019n>=1519>10242019-n>=1519>1024,所以二进制表示保持有11位。
他爸有两种方法,一个是在第2n2n号出局的时候,立刻反转由2n12n-1号开始报1,他在队列中排1019+n1019+n位,另一种是在2n+12n+1号报了1之后在进行反转,他在队列中排1019+n+11019+n+1位。
为了使得他能够留到最后,那么要满足以上情况:

(2019n1024)2+1=1019+n(+1) (2019-n-1024)*2+1 = 1019+n(+1)

此处的(+1)(+1)表示两种情况,整理可以得到:

972=3n(+1) 972= 3n(+1)

可以发现,当不带(+1)(+1)的时候,nn是存在解的,即n=324n=324,所以在628号淘汰出局的时候,他爸立刻转向,让627号报1,小明就可以成为留到最后的人。

四、桥牌

问题背景:

  1. 一共除去大小王有52张牌,每人13张牌,大小次序为AKQJT98765432。
  2. 桥牌分为东南西北四家,其中南北为一队,东西为一队。按西北东南顺序出牌。
  3. 每一轮要根据第一轮出牌的花色来出牌,没有的话只能出其他花色的垫牌,垫牌必定小。这一轮最大的下一轮首先出牌。
  4. 西家先出牌,南北获胜的方法是赢下所有十三轮,东西获胜只要赢下一轮就可以。

问当四人都明牌的情况下,南北方的必胜策略是什么?

  1. 西家先出红心T
西
黑桃 J876 A5432 KQT9
红心 [T]98 AKQ 7654 J32
方片 876 J2 T9543 AKQ
梅花 654 A32 T987 KQJ
  1. 西家先出红心T
西
黑桃 76 AKQ5 JT98 432
红心 [T]987 KQJ 65 A432
方片 T98 5432 76 AKQJ
梅花 7654 A2 KJT98 Q3
  1. 西家先出红心6
西
黑桃 J9876 A54 KT Q32
红心 [6]5 432 KQJ987 AT
方片 7 AKQ6543 JT98 2
梅花 65432 T AKQJ987

我的解答:真是绝了,这道题杀我,太久没打牌而陷入思维陷阱,笔试就只做出了第一个。

  1. 红心方片梅花北南都是绝对大,所以直接到下面这种情况,由于之前都是绝对大,此时北南可以控制由哪一边出牌
西
黑桃 J8 A5 T9

此时南家出9,如果西家出J,北家出A大。如果西家出8,那么北家出5,让南家大。

  1. 南北两家出红心方片在可以绝对大八轮,在这中间北家垫牌梅花2,出梅花A大一轮,这个时候东家要垫四张牌,此时都是南家大。
东(垫4张)
黑桃 AKQ5 JT98 432
梅花 KJT9 Q

如果东家垫的全部都是梅花,那么南家出梅花Q,北家垫掉黑桃5,之后北家三轮大。

如果东家垫过黑桃,那么北家黑桃AKQ5四轮都大。

  1. 第一轮打完东家出J逼南家出A大,之后北家有方片AKQ三张绝对大,之后难上手,一定会打掉梅花七张牌。由于西家黑桃J大不过北家的A和南家的Q,所以这里略去西家。
北(垫7张) 东(垫6张)
黑桃 A54 KT Q32
红心 43 KQ987 T
方片 AKQ6543 JT98 2

首先可以确定北家必定上手,那么东家不会留超过两张红心,红心789首先会被垫干净。

北(垫7张) 东(垫3张)
黑桃 A54 KT Q32
红心 43 KQ T
方片 AKQ6543 JT98 2

之后北家角度来看,东家必然垫完了987之后,北家和南家都无法打过东家的牌,必然不会再打红心,所以北家的两张红心也会被垫干净

北(垫5张) 东(垫3张)
黑桃 A54 KT Q32
红心 KQ T
方片 AKQ6543 JT98 2

那么北家不会再打红心,东家继续拿着两张红心是不合理的,但是南家还有一张红心,所以红心Q也会被垫掉

北(垫5张) 东(垫2张)
黑桃 A54 KT Q32
红心 K T
方片 AKQ6543 JT98 2

这个时候,由于轮次的问题,北家必须要垫掉四张牌,然后才轮到东家决策,北家垫掉方片3456

北(垫1张) 东(垫2张)
黑桃 A54 KT Q32
红心 K T
方片 AKQ JT98 2

此时东家必定垫掉方片8,之后北家垫掉黑桃4

东(垫1张)
黑桃 A5 KT Q32
红心 K T
方片 AKQ JT9 2

东家肯定不会再动方片,那么最后一张垫牌就在红心K和黑桃T之间决策。

如果东家垫了黑桃T,那么南家走一张黑桃2,牌权交给北家,北家打完之后打掉方片三张,之后用一张黑桃5让南家黑桃Q大,游戏结束。

如果东家垫了红心K,那么南家走一张红心T可以大一轮,北家垫掉黑桃5,之后北家四张牌都是绝对大,游戏结束。

五、个人项目简答

就一个有关自己曾经做过项目的简答题,要求简明扼要,所占的分并不多。

六、算法编程——信封嵌套

问题背景:给定NN个信封,每个信封由两个数w,hw,h描述,表示信封的宽和高,如果一个信封的宽和高分别小于另外一个信封,那么就可以放入另一个信封,每个信封都可以进行90°的旋转,所以w,hw,h是可以互换的。

  1. 求出嵌套层数最多的方案。
  2. 如果不是信封而是三维盒子,求出嵌套最多的方案。

我的解答:在我的算法中,信封和盒子都没有任何区别,所以这里统一考虑。

容易写出针对于两个信封比较的函数,即一个信封能否装下另一个信封。对每两个信封都做一个对比,如果AA信封能够装下BB信封,那么建立一条ABA \rightarrow B的连边。这样就构成了一个有向无环图,问题转变成了求图上的最长路径。

采用图上动态规划的方法来进行求解,每个节点上的val表示这个节点的信封作为最外层,所能够嵌套的层数,如果这个节点没有出边,那么将其val设置为1。否则的话,将他的val设置为子节点最大的val再加上1。遍历完整张图之后得到的最大的val值就是最大的嵌套层数。然后从有最大val值的节点进行一个DFS,就可以找到嵌套层数最多的方案。

建图的时间复杂度为O(N2)O(N^2),进行遍历的时间复杂度为O(N)O(N),所以总的时间复杂度为O(n2)O(n^2)。可能有O(NlogN)O(N\log N)的解法?

Update:首先将两个数值进行排序,小的为高,大的为宽。之后对于一个排序,另一个利用树状数组进行维护,查看有多少高宽都小于它的信封就可以了,时间复杂度O(NlogN)O(N\log N)

七、算法编程——螺旋数组

问题背景:给定一个边长为NN,数值由行列递增的方阵,问顺时针螺旋来读的话,第KK位数字是什么?要求算法能够在合理时间内求解N=30000000N=30000000量级的输入。

例如一个N=3N=3的方阵,数值就为:

123456789\begin{matrix}1&2&3\\4&5&6\\7&8&9\\\end{matrix}

此时顺时针螺旋读的话,顺序就是1,2,3,6,9,8,7,4,51,2,3,6,9,8,7,4,5,所以对于N=3,K=4N=3,K=4的情况得到的结果就是66

输入:两个整数N,KN,K

输出:一个整数

我的解答:首先确定对于KK个数,是属于第几个圈,之后确定第KK个数的行和列,然后计算得到结果。

时间复杂度为O(logN)O(\log N),如果不考虑计算开方的复杂度可以认为是O(1)O(1)的。

八、总结

笔试时长一共4个小时,除去高中竞赛之后好像基本没考过这么长的考试了。内容基本上全是智力测试,专业知识(金融知识)的考察基本为零,我觉得毫无准备硬冲基本非常容易一脸懵逼(对就是我)。不知道能不能过笔试(菜狗哭泣)。

Update

我这空了一道大题,编程算法题复杂度还写高了的居然过了笔试,不愧是我。